Description

[0001] The present invention relates to fusion partners which act as immunological fusion partners, as expression
enhancers, and preferably to fusion partners having both functions. The invention also relates to fusion proteins containing
them, to their manufacture, to their use in vaccines and to their use in medicines. In particular, fusion partners are provided
that contain a so-called choline binding domain, for example fusions comprising LytA from Streptococcus pneumoniae,
or the pneumococcal phage CP1 lysozyme (CPL1), wherein the choline binding domain is modified to include a
heterologous T-helper epitope. Such fusion partners are shown to improve the expression level of the heterologous protein
attached thereto and also find particular utility when fused to poorly immunogenic proteins or peptides that are otherwise
useful as vaccine antigens. More particularly, such fusion partners are useful in constructs comprising self-antigens, e.g.
tumour specific or tissue specific antigens.

Background to the invention 

[0002] Streptococcus pneumoniae synthesises an N-acetyl-L-alanine amidase, LytA, an autolysin that specifically
degrades the peptidoglycan backbone of the cell wall, eventually leading to cell lysis. Its polypeptide chain has two
domains. The N-terminal domain is responsible for the catalytic activity, whereas the C-terminal domain of LytA is
responsible for the affinity to choline and anchorage to the cell wall. This C-terminal domain is known to bind to choline
and choline analogues, and will also bind to tertiary amines such as DEAE (diethyl amino ethyl) commonly used in
chromatography.

[0003] LytA is a 318 amino acid protein, and the C-terminal part comprises a tandem of six imperfect repeats of 20 
or 21 amino acids and a short COOH-terminal tail. The repeats are located at the following positions: 

R1: 177-191
R2: 192-212
R3: 213-234
R4: 235-254
R5: 255-275
R6: 276-298
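
For illustration only, the repeat positions listed above can be treated as 1-based inclusive intervals and checked for length and contiguity. This is a minimal sketch: the names `LYTA_REPEATS`, `repeat_length`, `are_contiguous` and `slice_region` are hypothetical helpers introduced here and are not part of the disclosure.

```python
# Repeat boundaries (1-based, inclusive) exactly as listed in the text for LytA.
LYTA_REPEATS = {
    "R1": (177, 191),
    "R2": (192, 212),
    "R3": (213, 234),
    "R4": (235, 254),
    "R5": (255, 275),
    "R6": (276, 298),
}


def repeat_length(start, end):
    """Number of residues in an inclusive 1-based interval."""
    return end - start + 1


def are_contiguous(repeats):
    """True if every repeat starts immediately after the previous one ends."""
    spans = list(repeats.values())
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(spans, spans[1:]))


def slice_region(sequence, start, end):
    """Extract residues start..end (1-based, inclusive) from a sequence string."""
    return sequence[start - 1:end]


# Length of each listed repeat, and of the whole 177-298 region.
lengths = {name: repeat_length(*span) for name, span in LYTA_REPEATS.items()}
total_length = sum(lengths.values())
```

With these coordinates the listed intervals tile residues 177-298 without gaps, and `slice_region` could be applied to a full-length LytA sequence string (not reproduced here) to pull out an individual repeat.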

[0004] These repeats are predicted to be in a beta-turn conformation. The C-terminus is responsible for binding choline. 
Likewise the C-terminus of CPL1 is responsible for binding affinity and the aromatic residues in the repeat contribute to 
such binding. These proteins have been used as affinity tags to allow for rapid purification (Sanchez Puelles, Eur J 
Biochem. 1992, 203, 153-9). 

[0005] Other proteins with a choline-binding domain have also been studied in Streptococcus pneumoniae.

[0006] One of them, PspA (Pneumococcal Surface Protein A), is a virulence factor (Yother J and Briles (1992) J
Bacteriol 174(2) p 601). This protein is antigenic and immunogenic. It has a C-terminal domain consisting of 10 repeats
of 20 amino acids, homologous with the repeats of LytA.

[0007] CbpA (Choline-Binding Protein A) is involved in the adherence of the pneumococcus to human cells (Rosenow
et al (1997) Mol Microbiol 25 (5) p 819). It shows 10 repeats of 20 amino acids in the C-terminal domain which are almost
identical to those of PspA.

[0008] LytB and LytC have a different modular organisation from the above-mentioned proteins, as their choline-binding
domain, made up of 15 repeats and 11 repeats respectively, is situated at the N-terminal end, not at the C-terminal end
(Garcia P, Mol Microbiol (1999) 31 (4) p1275 and Garcia P et al (1999) Mol Microbiol 33(1) p128). Sequence comparison
shows LytB to have glucosamidase activity. LytC shows in vitro a lysozyme-type activity.

Additionally, three genes called PepA, PepB and PepC were cloned in 1995. Although their function is unknown, these
genes also have a variable number of repeats homologous to those of LytA.

[0009] In their infection cycle, phages synthesise murein hydrolases facilitating their passage into the bacterium. These
hydrolases have a choline-binding domain.

The muramidase CPL1 of the phage Cp-1 has been well studied. It shows 6 repeats of 20 amino acids at the C-terminus
involved in the specific recognition of choline (Garcia J. L., J. Virol 61 (8) p2573-80 (1987) and Garcia E, Proc Natl Acad
Sci (1988) p914). A comparison of the LytA and CPL1 repeats enables an initial consensus of those repeats to be made.
[0010] The murein hydrolases of phages Dp-1 (Garcia P et al (1983) J Gen Microbiol 129 (2) p489), Cpl-9 (Garcia P
et al (1989) Biochem Biophys Res Commun 158(1) p 251), HB-3 (Romero et al 1990 J Bacteriol 172 (9) p 5064-5070)
and EJ-1 (Diaz (1992) J Bacteriol 174 (17) p 5516) also show the characteristics of choline-binding domains.
[0011] This property is also shared by the lysozyme encoded by CP-1, a pneumococcal phage.
WO 99/10375 describes, inter alia, human papilloma virus proteins E6 or E7 linked to a His tag and the C-terminal
portion of LytA (herein C-LytA) and the purification of the proteins by differential affinity chromatography.
WO 99/40188 describes, inter alia, fusion proteins comprising MAGE antigens with a His tail and a C-LytA portion at
the N-terminus of the molecule.

[0012] It has now been surprisingly found that fusion partners according to the present invention, when fused to a
heterologous protein, are capable of enhancing the immunogenicity of the heterologous protein attached thereto. It
has also been found that the expression level of the heterologous protein attached thereto can be enhanced. The
present invention accordingly provides in a preferred embodiment an improved immunological fusion partner which can
also act as an expression enhancer.

Summary of the invention 

[0013] Accordingly the present invention comprises a fusion partner molecule comprising a choline binding domain
or a fragment thereof or an analogue thereof, and a heterologous promiscuous T helper epitope, preferably a promiscuous
MHC Class II T-epitope. Said fusion partner is capable of acting as an immunological fusion partner or as
an expression enhancer, and preferably as both an immunological partner and expression enhancer. A promiscuous
T-helper epitope is an epitope that binds to more than one MHC Class II allele, preferably more than 3 MHC Class II alleles.
In particular such epitopes are capable of eliciting a helper T cell response in large numbers of individuals expressing
diverse MHC haplotypes. Optionally, the fusion protein may retain its capability to bind to choline.
[0014] In a preferred embodiment the choline binding moiety is derived from the C terminus of LytA. Preferably the
C-LytA or derivatives comprises at least four repeats of any of the repeats R1 to R6 set forth in figure 1 (SEQ ID NO:1
to 6). In a most preferred embodiment, the C-LytA extends from amino acid 177-298, which contains a portion of the first
repeat and the complete five others.

[0015] In a further aspect of the invention, there is provided a fusion partner as herein defined further comprising a 
heterologous protein. The heterologous protein may be either chemically conjugated or fused to the fusion partner. 
Preferably the heterologous protein is a tumour-associated antigen or immunogenic fragment thereof. 
[0016] In a further aspect of the invention there is provided a nucleic acid sequence encoding the proteins as herein
defined. There is also provided an expression vector comprising said nucleic acid, and a host transformed with said 
nucleic acid or vector. 

[0017] In a further aspect of the invention there is provided an immunogenic composition comprising a protein or a 
nucleic acid sequence as herein described, and a pharmaceutically acceptable excipient, diluent or carrier. Preferably 
the immunogenic composition further comprises a Th-1 inducing adjuvant.

[0018] In yet a further embodiment, the invention provides the immunogenic composition or protein and nucleic acids 
for use in medicine. In particular, there is provided a protein or a nucleic acid of the invention, in the manufacture of a 
medicament for eliciting an immune response in a patient, or for use in the treatment or prophylaxis of infectious diseases 
or cancer diseases. 

[0019] The invention further provides for methods of treating a patient suffering from an infectious disease or a cancer
disease, particularly carcinoma of the breast, lung (particularly non-small cell lung carcinoma), colorectal, ovarian,
prostate, gastric and other GI (gastrointestinal) cancers, by the administration of a safe and effective amount of a composition or
nucleic acid as herein described.

[0020] In yet a further embodiment the invention provides a method of producing an immunogenic composition as 
herein described by admixing a nucleic acid or protein of the invention with a pharmaceutically acceptable excipient,
diluent or carrier. 

Detailed description of the invention 

[0021] As described herein, in one embodiment of the present invention the modified choline binding domain (fusion
partner) is capable of acting as an expression enhancer, such that the resulting fusion protein is expressed at a
higher yield in a host cell as compared to the unfused protein, preferably at a yield increased by more than about 100%
(i.e. more than 2-fold higher), or by 150% or more, as measured by SDS-PAGE followed by Coomassie blue staining or
silver staining, optionally followed by gel scanning. The modified choline binding domain according to the invention is
also capable of acting as an immunological partner, such that the resulting fusion protein with a heterologous protein
is more immunogenic in a host as compared to the unfused heterologous protein.
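
As a worked illustration of the yield comparison described above, the percentage increase and fold change can be computed from band intensities obtained by gel scanning. This is a sketch under stated assumptions: the function names and the intensity values are hypothetical, and the densitometry signal is assumed to be proportional to the amount of protein.

```python
def fold_change(fused_signal, unfused_signal):
    """Ratio of fused to unfused band intensity (arbitrary densitometry units)."""
    if unfused_signal <= 0:
        raise ValueError("unfused signal must be positive")
    return fused_signal / unfused_signal


def percent_increase(fused_signal, unfused_signal):
    """Yield increase of the fused protein relative to the unfused protein, in percent."""
    return (fold_change(fused_signal, unfused_signal) - 1.0) * 100.0


# Hypothetical scanned band intensities, for illustration only.
unfused = 100
fused = 250
increase = percent_increase(fused, unfused)  # 150.0, i.e. 2.5-fold
```

An increase of 100% corresponds to a 2-fold higher yield, and an increase of 150% to a 2.5-fold higher yield.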

[0022] In another embodiment of the present invention, the modified choline binding domain has the capability to act 
as an immunological fusion partner, allowing an enhanced immune response to be obtained with the fusion protein as 
compared to the heterologous protein alone. 
[0023] In a preferred embodiment, the modified choline binding domain has a dual function, having the capability to
act as both an immunological fusion partner and as an expression enhancer. 

[0024] In a preferred embodiment the choline binding moiety is derived from the C terminus of LytA. Preferably the
C-LytA or derivatives comprises at least two repeats, preferably at least four repeats. In this context, C-LytA derivatives
refer to variants of C-LytA according to the present invention, that is to say variants which have retained both the
capability of acting as an immunological partner and as an expression enhancer. Preferred variants include, for example,
peptides comprising an amino acid sequence having at least 85% identity, preferably at least 90% identity, more preferably
at least 95% identity, most preferably at least 97-99% identity, to any of the repeats R1 to R6 set forth in figure 1 (SEQ
ID NO:1 to 6), or a peptide comprising an amino acid sequence having at least 15, 20, 30, 40, 50 or 100 contiguous
amino acids from the amino acid sequence set forth in figure 1 (SEQ ID NO:1 to 8).
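
The two variant criteria above (percentage identity and a minimum run of contiguous amino acids) can be sketched as follows. This is a simplified illustration: it assumes the two sequences are already aligned and of equal length with no gaps, whereas in practice identity would be computed from a sequence alignment. The function names and the example strings are hypothetical and are not sequences from the figures.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical positions between two pre-aligned, equal-length, ungapped sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def has_contiguous_stretch(candidate, reference, min_len):
    """True if candidate contains at least min_len contiguous residues of reference."""
    return any(
        reference[i:i + min_len] in candidate
        for i in range(len(reference) - min_len + 1)
    )


# Hypothetical 10-residue strings differing at one position: 90% identity.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
meets_85 = identity >= 85.0
```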

[0025] Accordingly, in one aspect of the invention there is provided a fusion partner protein comprising a modified
choline binding domain and a heterologous promiscuous T helper epitope, wherein the choline binding domain is selected
from the group comprising:

a) the C-terminal domain of LytA as set forth in SEQ ID NO:7;

b) the sequence of SEQ ID NO:8;

c) a peptide sequence comprising an amino acid sequence having at least 85% identity, preferably at least 90%
identity, more preferably at least 95% identity, most preferably at least 97-99% identity, to any of SEQ ID NO: 1 to 6;

d) a peptide sequence comprising an amino acid sequence having at least 15, 20, 30, 40, 50 or 100 contiguous
amino acids from the amino acid sequence of SEQ ID NO:7 or SEQ ID NO:8.

In a most preferred embodiment, the C-LytA extends from amino acid 177-298, which contains a portion of the first repeat
and the complete five others, as set forth in figure 1.
[0026] The second component of the fusion partner, the heterologous T-cell epitope, is preferably selected from the
group of epitopes that bind to more than one MHC class II molecule in humans and are therefore recognised in a large
number of individuals. For example, epitopes that are specifically contemplated are the P2 and P30 epitopes from tetanus
toxoid (Panina-Bordignon, Eur. J. Immunol 19 (12), 2237 (1989)). In a preferred embodiment the heterologous T-cell
epitope is P2 or P30 from Tetanus toxin.

[0027] The P2 epitope has the sequence QYIKANSKFIGITE and corresponds to amino acids 830-843 of the Tetanus
toxin. The P30 epitope (residues 947-967 of Tetanus Toxin) has the sequence FNNFTVSFWLRVPKVSASHLE. The
FNNFTV sequence may optionally be deleted. Other universal T epitopes can be derived from the circumsporozoite
protein from Plasmodium falciparum, in particular the region 378-398 having the sequence DIEKKIAKMEKASSVFNVVNS
(Alexander J, (1994) Immunity 1 (9), p 751-761). Another epitope is derived from Measles virus fusion protein at residues
288-302, having the sequence LSEIKGVIVHRLEGV (Partidos CD, 1990, J. Gen. Virol 71(9) 2099-2105). Yet another
epitope is derived from hepatitis B virus surface antigen, in particular amino acids having the sequence
FFLLTRILTIPQSLD. Another set of epitopes is derived from diphtheria toxin. Four of these peptides (amino acids 271-290,
321-340, 331-350, 351-370) map within the T domain of fragment B of the toxin, and the remaining 2 map in the R
domain (411-430, 431-450):

PVFAGANYAAWAVNVAQVI
VHHNTEEIVAQSIALSSLMV
QSIALSSLMVAQAIPLVGEL
VDIGFAAYNFVESIINLFQV
QGESGHDIKITAENTPLPIA
GVLLPTIPGKLDVNKSKTHI

(Raju R., Navaneetham D., Okita D., Diethelm-Okita B., McCormick D., Conti-Fine B. M. (1995) Eur. J. Immunol.
25: 3207-14.)
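
As a simple consistency check on the epitopes described above, the stated residue ranges can be compared with the lengths of the quoted sequences, and the optional deletion of the FNNFTV segment of P30 can be shown explicitly. This is an illustrative sketch only: the dictionary layout, the label "MVF" and the function names are assumptions introduced here. Only the epitopes for which the text gives both a residue range and a sequence are included.

```python
# Sequences and residue ranges as quoted in the text (1-based, inclusive).
EPITOPES = {
    "P2": {"source": "tetanus toxin", "start": 830, "end": 843,
           "seq": "QYIKANSKFIGITE"},
    "P30": {"source": "tetanus toxin", "start": 947, "end": 967,
            "seq": "FNNFTVSFWLRVPKVSASHLE"},
    # "MVF" is a hypothetical label for the measles virus fusion protein epitope.
    "MVF": {"source": "measles virus fusion protein", "start": 288, "end": 302,
            "seq": "LSEIKGVIVHRLEGV"},
}


def span_matches_sequence(entry):
    """True if the inclusive residue range has the same length as the sequence."""
    return entry["end"] - entry["start"] + 1 == len(entry["seq"])


def delete_prefix(seq, prefix="FNNFTV"):
    """Return seq without the given N-terminal segment, if it is present."""
    return seq[len(prefix):] if seq.startswith(prefix) else seq


consistent = {name: span_matches_sequence(e) for name, e in EPITOPES.items()}
p30_short = delete_prefix(EPITOPES["P30"]["seq"])
```

Deleting the FNNFTV segment shortens P30 from 21 to 15 residues, which is relevant where the fusion partner is to be kept small.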

[0028] The heterologous T-epitope is preferably fused to C-LytA containing at least 4 repeats, preferably repeats 2-5
inclusive. One or more subsequent repeats may optionally be fused to the C-terminus of the T-epitope. Alternatively,
the heterologous T-epitope is preferably inserted between two consecutive repeats of C-LytA containing a total of at
least 4 repeats, or inserted into one of the repeats of C-LytA containing a total of at least 4 repeats. More preferably,
the C-LytA contains 6 repeats and the heterologous epitope is inserted within and at the beginning of the sixth repeat
of C-LytA.

[0029] The present invention further provides, in other aspects, fusion proteins that comprise at least one polypeptide
as described above, as well as polynucleotides encoding such fusion proteins, typically in the form of pharmaceutical
compositions, e.g., vaccine compositions, comprising a physiologically acceptable carrier and/or an immunostimulant.
Thus a self-protein or other poorly immunogenic protein may be fused to either the N or C terminal end of the resulting
fusion partner. Alternatively the self protein or poorly immunogenic protein may be inserted into the fusion partner. In
an optional embodiment a histidine tag of at least four, preferably more than 6 histidine residues, may be fused to the
alternative end of the poorly immunogenic protein. This would allow for the protein to be purified by affinity chromatography
steps, as a histidine tail, typically comprising at least four, preferably six or more residues, binds to metal ions and
is therefore suitable for immobilised metal ion affinity chromatography (IMAC).
Typical constructs would therefore comprise:

Poorly immunogenic protein - C-LytA repeats 1-4 - P2 epitope (inserted in or replacing C-LytA repeat 5) - C-LytA repeat 6

C-LytA repeats 1-4 - P2 epitope (inserted in or replacing C-LytA repeat 5) - C-LytA repeat 6 - Poorly immunogenic protein

Poorly immunogenic protein - C-LytA repeats 2-5 - P2 epitope (inserted into C-LytA repeat 6)

C-LytA repeats 2-5 - P2 epitope (inserted into C-LytA repeat 6) - Poorly immunogenic protein

Poorly immunogenic protein - C-LytA repeats 1-5 - P2 epitope inserted in C-LytA repeat 6

C-LytA repeats 1-5 - P2 epitope inserted in C-LytA repeat 6 - Poorly immunogenic protein

Poorly immunogenic protein - P2 epitope inserted into C-LytA repeat 1 - C-LytA repeats 2-5

P2 epitope inserted into C-LytA repeat 1 - C-LytA repeats 2-5 - Poorly immunogenic protein

Poorly immunogenic protein - P2 epitope inserted into C-LytA repeat 1 - C-LytA repeats 2-6

P2 epitope inserted into C-LytA repeat 1 - C-LytA repeats 2-6 - Poorly immunogenic protein

Poorly immunogenic protein - C-LytA repeat 1 - P2 epitope inserted into C-LytA repeat 2 - C-LytA repeats 3-6

C-LytA repeat 1 - P2 epitope inserted into C-LytA repeat 2 - C-LytA repeats 3-5 - Poorly immunogenic protein;

where "inserted into" means at any place within the repeat, for example between residues 1 and 2, or between 2 and 3, etc.
[0030] The promiscuous T helper epitope may be inserted within a repeat region, for example C-LytA repeats 2-5 -
C-LytA repeat 6a - P2 epitope - C-LytA repeat 6b, where the P2 epitope is inserted within the sixth repeat (see figure 2).
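
The arrangement "repeats 2-5 - repeat 6a - P2 epitope - repeat 6b" can be sketched as a simple string assembly. This is an illustration under stated assumptions: the repeat strings below are single-letter placeholders whose lengths follow the repeat positions given earlier for LytA, not the actual sequences of figure 1, and the function names and the insertion position are hypothetical.

```python
P2 = "QYIKANSKFIGITE"  # P2 epitope sequence as given in the text


def insert_epitope(repeat, epitope, position):
    """Split a repeat at `position` and return (part a, epitope, part b)."""
    if not 0 <= position <= len(repeat):
        raise ValueError("position must lie within the repeat")
    return repeat[:position], epitope, repeat[position:]


def build_fusion_partner(repeats, epitope, target_index, position):
    """Concatenate repeats, inserting the epitope into repeats[target_index]."""
    parts = []
    for index, repeat in enumerate(repeats):
        if index == target_index:
            parts.extend(insert_epitope(repeat, epitope, position))
        else:
            parts.append(repeat)
    return "".join(parts)


# Placeholder repeats 2 to 6 (NOT real sequences); lengths 21, 22, 20, 21, 23.
placeholder_repeats = ["A" * 21, "C" * 22, "D" * 20, "E" * 21, "G" * 23]

# Insert P2 two residues into the last placeholder repeat (standing in for repeat 6).
partner = build_fusion_partner(placeholder_repeats, P2, target_index=4, position=2)
partner_length = len(partner)
```

With these placeholder lengths the assembled partner is 121 residues long, which gives a feel for how one inserted epitope affects the overall size of the fusion partner.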
[0031] In other preferred embodiments the C-terminal end of CPL1 (C-CPL1) may be used as an alternative to C-LytA.
[0032] Alternatively, the P2 epitope in the above constructs may be replaced by other promiscuous T epitopes, for
example P30. In an embodiment of the invention, two or more promiscuous epitopes are part of the fusion construct. It
is however preferred to keep the fusion partner as small as possible, thus limiting the number of potentially interfering
CD8+ and B epitopes. Thus the fusion partner is preferably no bigger than 100-140 amino acids, preferably no bigger
than 120 amino acids, typically about 100 amino acids.

[0033] The antigen to which the fusion partner is fused may be from bacterial, viral, protozoan, fungal or mammalian, 
including human, sources. 

[0034] The fusion partners of the present invention are preferably fused to a self antigen such as a tumour associated
or tissue specific antigen, such as those for prostate, breast, colorectal, lung, pancreatic, ovarian, renal or melanoma
cancers. Fragments of said self or tumour antigens are expressly contemplated to be fused to the fusion partner of the
invention. Typically the fragment will contain at least 20, preferably 50, more preferably 100 contiguous amino acids of
the full-length sequence. Typically such fragments will be devoid of one or more transmembrane domains or may have
N-terminal or C-terminal deletions of about 3, 5, 8, 10, 15, 20, 28, 33, 50, 54 amino acids. Such fragments will, when
suitably presented, be able to generate immune responses that recognise the full length protein. Particularly illustrative
polypeptides of the present invention comprise a sequence of at least 10 contiguous amino acids, preferably 20, more
preferably 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180 amino acids of a tumour associated
or tissue specific protein fused to the fusion partner.

[0035] The polypeptides of the invention are immunogenic, i.e., they react detectably within an immunoassay (such
as an ELISA or T-cell stimulation assay) with antisera and/or T-cells from a patient with Cripto-expressing cancer.
Screening for immunogenic activity can be performed using techniques well known to the skilled artisan. For example, 
such screens can be performed using methods such as those described in Harlow and Lane, Antibodies; A Laboratory 
Manual, Cold Spring Harbor Laboratory, 1988. In one illustrative example, a polypeptide may be immobilised on a solid 
support and contacted with patient sera to allow binding of antibodies within the sera to the immobilised polypeptide. 
Unbound sera may then be removed and bound antibodies detected using, for example, 125I-labeled Protein A. As would
be recognised by the skilled artisan, immunogenic portions of tumour associated or tumour specific antigen are also 
encompassed by the present invention. An "immunogenic portion" as used herein, is a fragment that itself is
immunologically reactive (i.e., specifically binds) with the B-cells and/or T-cell surface antigen receptors that recognize the
polypeptide. Immunogenic portions may generally be identified using well known techniques, such as those summarized
in Paul, Fundamental Immunology, 3rd ed., 243-247 (Raven Press, 1993) and references cited therein. Such techniques
include screening polypeptides for the ability to react with antigen-specific antibodies, antisera and/or T-cell lines or 
clones. As used herein, antisera and antibodies are "antigen-specific" if they specifically bind to an antigen (i.e., they 
react with the protein in an ELISA or other immunoassay, and do not react detectably with unrelated proteins). Such 
antisera and antibodies may be prepared as described herein, and using well-known techniques. In one preferred 
embodiment, an immunogenic portion of a polypeptide is a portion that reacts with antisera and/or T-cells at a level that 
is not substantially less than the reactivity of the full-length polypeptide (e.g., in an ELISA and/or T-cell reactivity assay). 
Preferably, the level of immunogenic activity of the immunogenic portion is at least about 50%, preferably at least about 
70% and most preferably greater than about 90% of the immunogenicity for the full-length polypeptide. In some instances, 
preferred immunogenic portions will be identified that have a level of immunogenic activity greater than that of the 
corresponding full-length polypeptide, e.g., having greater than about 100% or 150% or more immunogenic activity.
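
The relative activity thresholds described above (about 50%, 70% and 90% of the full-length polypeptide) can be expressed as a small calculation. This is a sketch under stated assumptions: the function names, the tier labels and the signal values are hypothetical, and the assay read-out (for example an ELISA signal) is assumed to scale linearly with immunogenic activity.

```python
def relative_activity(portion_signal, full_length_signal):
    """Activity of a portion as a percentage of the full-length polypeptide."""
    if full_length_signal <= 0:
        raise ValueError("full-length signal must be positive")
    return 100.0 * portion_signal / full_length_signal


def preference_tier(percent):
    """Map a relative activity to the preference levels named in the text."""
    if percent > 90.0:
        return "most preferred (greater than about 90%)"
    if percent >= 70.0:
        return "preferred (at least about 70%)"
    if percent >= 50.0:
        return "acceptable (at least about 50%)"
    return "below the stated thresholds"


# Hypothetical assay read-outs in arbitrary units.
tier = preference_tier(relative_activity(60, 120))
```

A portion whose signal exceeds that of the full-length polypeptide (more than 100%) falls in the top tier under this scheme.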
[0036] In certain other embodiments, illustrative immunogenic portions may include peptides in which an N-terminal 
leader sequence and/or transmembrane domain have been deleted. Other illustrative immunogenic portions will contain 
a small N- and/or C-terminal deletion (e.g., about 1-50 amino acids, preferably about 1-30 amino acids, more preferably
about 5-15 amino acids), relative to the mature protein.

[0037] Exemplary antigens or fragments derived therefrom include MAGE 1, MAGE 3 and MAGE 4 or other MAGE
antigens such as disclosed in WO 99/40188, PRAME (WO 96/10577), BAGE, RAGE, LAGE 1 (WO 98/32855), LAGE
2 (also known as NY-ESO-1, WO 98/14464), XAGE (Liu et al, Cancer Res, 2000, 60:4752-4755; WO 02/18584), SAGE,
and HAGE (WO 99/53061) or GAGE (Robbins and Kawakami, 1996, Current Opinion in Immunology 8, pps 628-636;
Van den Eynde et al., International Journal of Clinical & Laboratory Research (submitted 1997); Correale et al. (1997),
Journal of the National Cancer Institute 89, p293). Indeed these antigens are expressed in a wide range of tumour types
such as melanoma, lung carcinoma, sarcoma and bladder carcinoma.

[0038] In a preferred embodiment prostate antigens are utilised, such as Prostate specific antigen (PSA), PAP, PSCA
(PNAS 95(4) 1735-1740, 1998), PSMA or the antigen known as prostase.

[0039] In a particularly preferred embodiment, the prostate antigen is P501S or a fragment thereof. P501S, also named
prostein (Xu et al., Cancer Res. 61, 2001, 1563-1568), is known as SEQ ID NO. 113 of WO 98/37814 and is a 553 amino
acid protein. Immunogenic fragments and portions thereof comprising at least 20, preferably 50, more preferably 100
contiguous amino acids as disclosed in the above referenced patent application are specifically contemplated by the
present invention. Preferred fragments are disclosed in WO 98/50567 (PS108 antigen) and as prostate cancer-associated
protein (SEQ ID NO: 9 of WO 99/67384). Other preferred fragments are amino acids 51-553, 34-553 or 55-553 of the
full-length P501S protein. In particular, constructs 1, 2 and 3 (see figure 2, SEQ ID NOs. 27-32) are expressly contemplated,
and can be expressed in yeast systems; for example, DNA sequences encoding such polypeptides can be expressed
in a yeast system.

[0040] Prostase is a prostate-specific serine protease (trypsin-like), 254 amino acids long, with a conserved serine
protease catalytic triad H-D-S and an amino-terminal pre-propeptide sequence, indicating a potential secretory function
(P. Nelson, Lu Gan, C. Ferguson, P. Moss, R. Gelinas, L. Hood & K. Wang, "Molecular cloning and characterisation of
prostase, an androgen-regulated serine protease with prostate restricted expression", Proc. Natl. Acad. Sci. USA
(1999) 96, 3114-3119). A putative glycosylation site has been described. The predicted structure is very similar to other
known serine proteases, showing that the mature polypeptide folds into a single domain. The mature protein is 224
amino acids long, with one A2 epitope shown to be naturally processed. Prostase nucleotide sequence and deduced
polypeptide sequence and homologues are disclosed in Ferguson, et al. (Proc. Natl. Acad. Sci. USA 1999, 96, 3114-3119)
and in International Patent Applications No. WO 98/12302 (and also the corresponding granted patent US 5,955,306),
WO 98/20117 (and also the corresponding granted patents US 5,840,871 and US 5,786,148) (prostate-specific kallikrein)
and WO 00/04149 (P703P).

[0041] Other prostate specific antigens are known from WO 98/37418 and WO 00/04149. Another is STEAP (PNAS 96
14523-14528, 7-12, 1999).

[0042] Other tumour associated antigens useful in the context of the present invention include: Plu-1 (J Biol. Chem
274 (22) 15633-15645, 1999), HASH-1, HASH-2 (Alders, M. et al., Hum. Mol. Genet. 1997, 6, 859-867), Cripto (Salomon
et al, Bioessays 1999, 21, 61-70; US patent 5654140), CASB616 (WO 00/53216), Criptin (US 5,981,215). Additionally,
antigens particularly relevant for vaccines in the therapy of cancer also comprise tyrosinase, telomerase, P53, NY-Br1.1
(WO 01/47959) and fragments thereof such as disclosed in WO 00/43420, B726 (WO 00/60076, SEQ ID nos 469 and
463; WO 01/79286, SEQ ID nos 474 and 475), P510 (WO 01/34802 SEQ ID nos 537 and 538) and survivin.
[0043] The present invention is also useful in combination with breast cancer antigens such as Her-2/neu,
mammaglobin (US patent 5,668,267), B305D (WO 00/61753 SEQ ID nos 299, 304, 305 and 315), or those disclosed in WO
00/52165, WO 99/33869, WO 99/19479, WO 98/45328. Her-2/neu antigens are disclosed, inter alia, in US patent
5,801,005. Preferably the Her-2/neu comprises the entire extracellular domain (comprising approximately amino acids
1-645) or fragments thereof and at least an immunogenic portion of or the entire intracellular domain (approximately the
C-terminal 580 amino acids). In particular, the intracellular portion should comprise the phosphorylation domain or
fragments thereof. Such constructs are disclosed in WO 00/44899. A particularly preferred construct is known as ECD-PhD;
a second is known as ECD deltaPhD (see WO 00/44899). The Her-2/neu as used herein can be derived from rat, mouse
or human.

[0044] Certain tumour antigens are small peptide antigens (ie less than about 50 amino acids). These antigens can 
be chemically conjugated to the modified choline binding protein of the present invention. 

[0045] Exemplary peptides include mucin derived peptides such as MUC-1 (see for example US 5,744,144; US
5,827,666; WO 88/05054, US 4,963,484). Specifically contemplated are MUC-1 derived peptides that comprise at least
one repeat unit of the MUC-1 peptide, preferably at least two such repeats, and which are recognised by the SM3 antibody
(US 6,054,438). Other mucin derived peptides include peptides from MUC-5.

[0046] Alternatively, said antigen is an interleukin such as IL13 and IL14, which are preferred. Or said antigen may be
a self peptide hormone such as whole length Gonadotrophin hormone releasing hormone (GnRH, WO 95/20600), a
short 10 amino acid long peptide, useful in the treatment of many cancers, or in immunocastration.
[0047] Other tumour-specific antigens suitable to be coupled with the modified choline binding protein of the
present invention include, but are not restricted to, tumour-specific gangliosides such as GM2 and GM3.

[0048] The covalent coupling of the peptide to the modified choline binding protein can be carried out in a manner well
known in the art. Thus, for example, for direct covalent coupling it is possible to utilise a carbodiimide, glutaraldehyde
or N-[γ-maleimidobutyryloxy]succinimide ester, utilising common commercially available heterobifunctional linkers such
as CDAP and SPDP (using the manufacturer's instructions). After the coupling reaction, the immunogen can easily be isolated
and purified by means of a dialysis method, a gel filtration method, a fractionation method etc.

w [0049] The antigen may also be derived from sources which are pathogenic to humans, such as such as Human 
Immunodeficiency virus HIV-1 (such as tat, nef, reverse transcriptase, gag, gp120 and gp160), human herpes simplex 
viruses, such as gD or derivatives thereof or Immediate Early protein such as ICP27 from HSV1 or HSV2, cytomegalovirus 
((esp Human)(such as gB or derivatives thereof), Rotavirus (including live-attenuated viruses), Epstein Barr virus (such 
as gp350 or derivatives thereof), Varicella Zoster Virus (such as gpl, II and IE63), or from a hepatitis virus such as 

15 hepatitis B virus (for example Hepatitis B Surface antigen or a derivative thereof), hepatitis A virus, hepatitis C virus and 
hepatitis E virus, or from other viral pathogens, such as paramyxoviruses: Respiratory Syncytial virus (such as F and G 
proteins or derivatives thereof), parainfluenza virus, measles virus, mumps virus, human papilloma viruses (for example 
HPV6, 11, 16, 18, ..), flaviviruses (e.g. Yellow Fever Virus, Dengue Virus, Tick-borne encephalitis virus, Japanese 
Encephalitis Virus) or Influenza virus (whole live or inactivated virus, split influenza virus, grown in eggs or MDCK cells, 

20 or whole flu virosomes (as described by R. Gluck, Vaccine, 1992, 10, 915-920) or purified or recombinant proteins 
thereof, such as HA, NP, NA, or M proteins, or combinations thereof), or derived from bacterial pathogens such as 
Neisseria spp, including N. gonorrheaand N. meningitidisifor example capsular polysaccharides and conjugates thereof, 
transferrin-binding proteins, lactoferrin binding proteins, PIIC, adhesins); S. pyogenes (for example M proteins or frag- 
ments thereof, C5A protease, lipoteichoic acids), S. agaiactiae, S. mutans; H. ducreyi; Moraxella spp, including M 

25 catarrhalis, also known as Branhamella catarrhalis (for example high and low molecular weight adhesins and invasins); 
Bordetella spp, including B. pertussis (for example pertactin, pertussis toxin or derivatives thereof, filamenteous hemag- 
glutinin, adenylate cyclase, fimbriae), 6. parapertussis and B. bronchiseptica; Mycobacterium spp. , including M. tuber- 
culosis (for example ESAT6, Antigen 85A, -Bor-C), M. bovis, M. leprae, M. avium, M. paratuberculosis, M. smegmatis; 
Legionella spp, including L. pneumophila; Escherichia spp, including enterotoxic E. coli (for example colonization factors,
heat-labile toxin or derivatives thereof, heat-stable toxin or derivatives thereof), enterohemorrhagic E. coli, enteropathogenic
E. coli (for example Shiga toxin-like toxin or derivatives thereof); Vibrio spp, including V. cholera (for example
cholera toxin or derivatives thereof); Shigella spp, including S. sonnei, S. dysenteriae, S. flexneri; Yersinia spp, including
Y. enterocolitica (for example a Yop protein), Y. pestis, Y. pseudotuberculosis; Campylobacter spp, including C. jejuni
(for example toxins, adhesins and invasins) and C. coli; Salmonella spp, including S. typhi, S. paratyphi, S. choleraesuis,
S. enteritidis; Listeria spp., including L. monocytogenes; Helicobacter spp, including H. pylori (for example urease,
catalase, vacuolating toxin); Pseudomonas spp, including P. aeruginosa; Staphylococcus spp., including S. aureus, S.
epidermidis; Enterococcus spp., including E. faecalis, E. faecium; Clostridium spp., including C. tetani (for example
tetanus toxin and derivative thereof), C. botulinum (for example botulinum toxin and derivative thereof), C. difficile (for
example Clostridium toxins A or B and derivatives thereof); Bacillus spp., including B. anthracis (for example botulinum
toxin and derivatives thereof); Corynebacterium spp., including C. diphtheriae (for example diphtheria toxin and derivatives
thereof); Borrelia spp., including B. burgdorferi (for example OspA, OspC, DbpA, DbpB), B. garinii (for example
OspA, OspC, DbpA, DbpB), B. afzelii (for example OspA, OspC, DbpA, DbpB), B. andersonii (for example OspA, OspC,
DbpA, DbpB), B. hermsii; Ehrlichia spp., including E. equi and the agent of the Human Granulocytic Ehrlichiosis; Rickettsia
spp, including R. rickettsii; Chlamydia spp., including C. trachomatis (for example MOMP, heparin-binding proteins), C.
pneumoniae (for example MOMP, heparin-binding proteins), C. psittaci; Leptospira spp., including L. interrogans;
Treponema spp., including T. pallidum (for example the rare outer membrane proteins), T. denticola, T. hyodysenteriae;
or derived from parasites such as Plasmodium spp., including P. falciparum; Toxoplasma spp., including T. gondii (for
example SAG2, SAG3, Tg34); Entamoeba spp., including E. histolytica; Babesia spp., including B. microti; Trypanosoma
spp., including T. cruzi; Giardia spp., including G. lamblia; Leishmania spp., including L. major; Pneumocystis spp.,
including P. carinii; Trichomonas spp., including T. vaginalis; Schistosoma spp., including S. mansoni, or derived from
yeast such as Candida spp., including C. albicans; Cryptococcus spp., including C. neoformans.
[0050] Other preferred specific antigens for M. tuberculosis are for example Tb Ra12, Tb H9, Tb Ra35, Tb38-1, Erd
14, DPV, MTI, MSL, mTTC2 and hTCC1 (WO 99/51748). Proteins for M. tuberculosis also include fusion proteins and
variants thereof where at least two, preferably three polypeptides of M. tuberculosis are fused into a larger protein.
Preferred fusions include Ra12-TbH9-Ra35, Erd14-DPV-MTI, DPV-MTI-MSL, Erd14-DPV-MTI-MSL-mTCC2,
Erd14-DPV-MTI-MSL, DPV-MTI-MSL-mTCC2, TbH9-DPV-MTI (WO 99/51748).

[0051] Most preferred antigens for Chlamydia include for example the High Molecular Weight Protein (HWMP) (WO 
99/17741), ORF3 (EP 366 412), and putative membrane proteins (Pmps). Other Chlamydia antigens of the vaccine 
formulation can be selected from the group described in WO 99/28475.

[0052] Preferred bacterial antigens are derived from Streptococcus spp, including S. pneumoniae (for example capsular
polysaccharides and conjugates thereof, PsaA, PspA, streptolysin, choline-binding proteins) and the protein antigen
Pneumolysin (Biochem Biophys Acta, 1989, 67, 1007; Rubins et al., Microbial Pathogenesis, 25, 337-342), and mutant
detoxified derivatives thereof (WO 90/06951; WO 99/03884). Other preferred bacterial antigens are derived from Haemophilus
spp., including H. influenzae type B (for example PRP and conjugates thereof), non typeable H. influenzae,
for example OMP26, high molecular weight adhesins, P5, P6, protein D and lipoprotein D, and fimbrin and fimbrin derived
peptides (US 5,843,464) or multiple copy variants or fusion proteins thereof.

[0053] Derivatives of Hepatitis B Surface antigen are well known in the art and include, inter alia, those PreS1, PreS2
and S antigens set forth in European Patent applications EP-A-414 374; EP-A-0304 578, and EP 198-474. In one
preferred aspect the HBV antigen is HBV polymerase (Ji Hoon Jeong et al., 1996, BBRC 223, 264-271; Lee H.J. et al., Biotechnol.
Lett. 15, 821-826). In another preferred aspect the antigen within the fusion is a HIV-1 antigen, gp120, especially when
expressed in CHO cells. In a further embodiment, the antigen comprises gD2t as hereinabove defined.

[0054] In a preferred embodiment of the present invention fusions comprise an antigen derived from the Human
Papilloma Virus (HPV 6a, 6b, 11, 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59 and 68), in particular those HPV serotypes
considered to be responsible for genital warts (HPV 6 or HPV 11 and others), and the HPV viruses responsible for
cervical cancer (HPV16, HPV18 and others).

[0055] Suitable HPV antigens are E1, E2, E4, E5, E6, E7, L1 and L2. Particularly preferred forms of genital wart
prophylactic, or therapeutic, fusions comprise L1 particles or capsomers, and fusion proteins comprising one or more
antigens selected from the HPV 6 and HPV 11 proteins E6, E7, L1, and L2.

[0056] The most preferred forms of fusion protein are: L2E7 as disclosed in WO 96/26277, and proteinD(1/3)-E7 
disclosed in GB 9717953.5 (PCT/EP98/05285). 

[0057] A preferred HPV cervical infection or cancer prophylaxis or therapeutic vaccine composition may comprise
HPV 16 or 18 antigens, for example L1 or L2 antigen monomers, or L1 or L2 antigens presented together as a virus
like particle (VLP), or the L1 protein presented alone in a VLP or capsomer structure. Such antigens, virus like
particles and capsomers are per se known. See for example WO94/00152, WO94/20137, WO94/05792, and WO93/02184.
[0058] Additional early proteins may be included alone or as fusion proteins such as E7, E2 or preferably E5 for
example; particularly preferred embodiments of this include a VLP comprising L1E7 fusion proteins (WO 96/11272).
Particularly preferred HPV 16 antigens comprise the early proteins E6 or E7 in fusion with a protein D carrier to form
Protein D - E6 or E7 fusions from HPV 16, or combinations thereof; or combinations of E6 or E7 with L2 (WO 96/26277).
Alternatively the HPV 16 or 18 early proteins E6 and E7 may be presented in a single molecule, preferably a Protein
D - E6/E7 fusion. Other fusions optionally contain either or both E6 and E7 proteins from HPV 18, preferably in the form
of a Protein D - E6 or Protein D - E7 fusion protein or Protein D - E6/E7 fusion protein. Fusions may comprise antigens
from other HPV strains, preferably from strains HPV 31 or 33.

[0059] Fusions according to the present invention comprise antigens derived from parasites that cause Malaria. For
example, preferred antigens from Plasmodium falciparum include RTS,S and TRAP. RTS is a hybrid protein comprising
substantially all the C-terminal portion of the circumsporozoite (CS) protein of P. falciparum linked via four amino acids
of the preS2 portion of Hepatitis B surface antigen to the surface (S) antigen of hepatitis B virus. Its full structure is
disclosed in the International Patent Application No. PCT/EP92/02591, published under Number WO 93/10152 claiming
priority from UK patent application No. 9124390.7. When expressed in yeast RTS is produced as a lipoprotein particle,
and when it is co-expressed with the S antigen from HBV it produces a mixed particle known as RTS,S. TRAP antigens
are described in the International Patent Application No. PCT/GB89/00895, published under WO 90/01496. A preferred
embodiment of the present invention is a fusion wherein the antigenic preparation comprises a combination of the RTS,S
and TRAP antigens. Other Plasmodia antigens that are likely candidates to be components of the fusion are P.
falciparum MSP1, AMA1, MSP3, EBA, GLURP, RAP1, RAP2, Sequestrin, PfEMP1, Pf332, LSA1, LSA3, STARP, SALSA,
PfEXP1, Pfs25, Pfs28, PFS27/25, Pfs16, Pfs48/45, Pfs230 and their analogues in Plasmodium spp.
[0060] The present invention also provides a polynucleotide encoding the fusion partner according to the present
invention. The invention further relates to a polynucleotide that hybridises to the polynucleotide sequence provided herein
in figure 1 (SEQ ID NO:9 to 16). In this regard, the invention especially relates to polynucleotides that hybridise under
stringent conditions to the polynucleotide described herein. As herein used, the terms "stringent conditions" and "stringent
hybridisation conditions" mean hybridisation occurring only if there is at least 95% and preferably at least 97% identity
between the sequences. A specific example of stringent hybridisation conditions is overnight incubation at 42°C in a
solution comprising: 50% formamide, 5x SSC (150mM NaCl, 15mM trisodium citrate), 50 mM sodium phosphate (pH 7.6),
5x Denhardt's solution, 10% dextran sulfate, and 20 micrograms/ml of denatured, sheared salmon sperm DNA, followed
by washing the hybridisation support in 0.1x SSC at about 65°C. Hybridisation and wash conditions are well known and
exemplified in Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y.,
(1989), particularly Chapter 11 therein. Solution hybridisation may also be used with the polynucleotide sequences
provided by the invention.
[0061] The present invention also provides a polynucleotide encoding the polypeptide comprising the fusion partner
according to the present invention fused to a tumour associated antigen or fragment thereof. In particular, the present
invention provides for polynucleotide sequences encoding a fusion partner protein comprising a choline binding domain
and a heterologous promiscuous T helper epitope, preferably wherein the choline binding domain is derived from the C
terminus of LytA. In a more preferred embodiment, the C-LytA moiety of the polynucleotides according to the invention
comprises at least four repeats of any of SEQ ID NO. 9-14, more preferably comprises the sequence of SEQ ID NO. 15,
still more preferably the sequence of SEQ ID NO. 16. In other related embodiments, the present invention provides for
polynucleotide variants having substantial identity to the sequences disclosed herein in SEQ ID NOs:9-16, for example
those comprising at least 70% sequence identity, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or
99% or higher, sequence identity compared to a polynucleotide sequence of this invention using conventional methods,
e.g., BLAST analysis using standard parameters. In a still further embodiment the polynucleotide as claimed further
comprises a heterologous protein.

[0062] Such polynucleotide sequences can be inserted into a suitable expression vector and expressed in a suitable
host. Vectors may be provided which encode the modified choline binding protein of the invention and which contain a
suitable restriction site into which a DNA encoding a poorly immunogenic protein can be inserted to produce a fusion
protein.

In other embodiments of the invention, polynucleotide sequences or fragments thereof which encode polypeptide fusions
of the invention may be used in recombinant DNA molecules to direct expression of a polypeptide in appropriate host
cells. Due to the inherent degeneracy of the genetic code, other DNA sequences that encode substantially the same or
a functionally equivalent amino acid sequence may be produced and these sequences may be used to clone and express
a given polypeptide.

[0063] As will be understood by those of skill in the art, it may be advantageous in some instances to produce
polypeptide-encoding nucleotide sequences possessing non-naturally occurring codons. The DNA code has 4 letters (A, T, C
and G) and uses these to spell three letter "codons" which represent the amino acids of the proteins encoded in an organism's
genes. The linear sequence of codons along the DNA molecule is translated into the linear sequence of amino acids in
the protein(s) encoded by those genes. The code is highly degenerate, with 61 codons coding for the 20 natural amino
acids and 3 codons representing "stop" signals. Thus, most amino acids are coded for by more than one codon - in fact
several are coded for by four or more different codons.
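
The degeneracy described above can be checked with a short sketch. It builds the standard genetic code (the compact 64-letter string is the conventional listing in TCAG base order) and counts sense codons, stop codons, and the amino acids served by four or more codons. The variable names are illustrative only and are not part of the specification.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated as TTT, TTC, TTA, TTG, TCT, ...
# ("*" marks a stop codon). Names below are illustrative assumptions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(bases): aa for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Split the 64 codons into sense codons and stop signals.
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

# Number of synonymous codons per amino acid.
degeneracy = Counter(CODON_TABLE[c] for c in sense_codons)
four_or_more = sorted(aa for aa, n in degeneracy.items() if n >= 4)

print(len(sense_codons), "sense codons for", len(degeneracy), "amino acids")
print("stop codons:", stop_codons)
print("amino acids with four or more codons:", four_or_more)
```

Only methionine and tryptophan have a single codon in this table, which is why codon choice is available for almost every position of a coding sequence.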

[0064] Where more than one codon is available to code for a given amino acid, it has been observed that the codon
usage patterns of organisms are highly non-random. Different species show a different bias in their codon selection and,
furthermore, utilisation of codons may be markedly different in a single species between genes which are expressed at
high and low levels. This bias is different in viruses, plants, bacteria and mammalian cells, and some species show a
stronger bias away from a random codon selection than others. For example, humans and other mammals are less
strongly biased than certain bacteria or viruses. For these reasons, there is a significant probability that a mammalian
gene expressed in E. coli or a viral gene expressed in mammalian cells will have an inappropriate distribution of codons
for efficient expression. It is believed that the presence in a heterologous DNA sequence of clusters of codons which
are rarely observed in the host in which expression is to occur is predictive of low heterologous expression levels in
that host.

[0065] In consequence, codons preferred by a particular prokaryotic (for example E. coli) or eukaryotic (for example yeast) host
can be optimised, that is selected to increase the rate of protein expression, to produce a recombinant RNA transcript
having desirable properties, such as for example a half-life which is longer than that of a transcript generated from the
naturally occurring sequence, or to optimise the immune response in humans. The process of codon optimisation may
include any sequence, generated either manually or by computer software, where some or all of the codons of the native
sequence are modified. Several methods have been published (Nakamura et al., Nucleic Acids Research 1996, 24:
214-215; WO98/34640). One preferred method according to this invention is the Syngene method, a modification of the Calcgene
method (R. S. Hale and G. Thompson, Protein Expression and Purification Vol. 12 pp. 185-188 (1998)).
[0066] Accordingly in a preferred embodiment the DNA sequence of the protein has an RSCU (Relative Synonymous
Codon Usage, also known as Codon Index CI) of at least 0.65 and has less than 85% identity to the corresponding
wild type region.
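
As an illustration of the identity criterion, the sketch below compares a codon-optimised sequence with its wild type counterpart position by position. This is a simplification that assumes both sequences encode the same protein and therefore have equal length, so no alignment step is needed; the function name and the two short sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (no alignment is done)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical example: both sequences encode Met-Lys-Asn-Phe-Tyr.
wild_type = "ATGAAAAATTTTTAT"
optimised = "ATGAAGAACTTCTAC"

identity = percent_identity(wild_type, optimised)
print(f"{identity:.1f}% identity, below 85%: {identity < 85.0}")
```

Because synonymous changes usually fall on the third base of a codon, a fully recoded sequence tends to stay above roughly two thirds identity, so the 85% ceiling is reached by changing a sufficient fraction of codons rather than all of them.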

[0067] This process of codon optimisation and the resulting constructs are advantageous as they may have some or
all of the following benefits: 1) to improve expression of the gene product by replacing rare or infrequently used codons
with more frequently used codons, 2) to remove or include restriction enzyme sites to facilitate downstream cloning,
3) to reduce the potential for homologous recombination between the insert sequence in the DNA vector and genomic
sequences and 4) to improve the immune response in humans by raising a cellular and/or an antibody response (preferably
both responses) against the target antigen. The sequences of the present invention advantageously have reduced
recombination potential, but express to at least the same level as the wild type sequences. Due to the nature of the
algorithms used by the SynGene programme to generate a codon optimised sequence, it is possible to generate an
extremely large number of different codon optimised sequences which will perform a similar function. In brief, the codons
are assigned using a statistical method to give a synthetic gene having a codon frequency closer to that found naturally
in highly expressed E. coli and human genes, such as human β-actin. Illustrative,
although non limiting, examples of suitable codon-optimised sequences are given in SEQ ID NOs:19-22 and SEQ ID
NOs:24-26.
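
The idea of assigning codons statistically can be sketched as follows. This is not the SynGene or Calcgene algorithm, whose details are not given here; it is a simplified stand-in that draws each codon at random, weighted by the "Fraction" values of Table 1 for a handful of amino acids. The function name, the `min_fraction` parameter and the example peptide are assumptions made for illustration.

```python
import random

# "Fraction" values from Table 1 for a subset of amino acids (one-letter codes).
CODON_FRACTIONS = {
    "M": {"ATG": 1.00},
    "K": {"AAG": 0.82, "AAA": 0.18},
    "N": {"AAC": 0.78, "AAT": 0.22},
    "F": {"TTC": 0.80, "TTT": 0.20},
    "Y": {"TAC": 0.74, "TAT": 0.26},
    "H": {"CAC": 0.79, "CAT": 0.21},
    "C": {"TGC": 0.68, "TGT": 0.32},
}


def back_translate(peptide: str, rng: random.Random, min_fraction: float = 0.0) -> str:
    """Pick one codon per residue, weighted by its usage fraction.

    Codons whose fraction is below min_fraction are never chosen.
    """
    codons = []
    for residue in peptide:
        options = {
            codon: fraction
            for codon, fraction in CODON_FRACTIONS[residue].items()
            if fraction >= min_fraction
        }
        chosen = rng.choices(list(options), weights=list(options.values()), k=1)[0]
        codons.append(chosen)
    return "".join(codons)


peptide = "MKNFYHC"  # hypothetical peptide for illustration
print(back_translate(peptide, random.Random(1)))           # weighted random draw
print(back_translate(peptide, random.Random(1), 0.5))      # only majority codons
```

Repeating the weighted draw with different seeds yields many distinct sequences that all encode the same peptide, which mirrors the remark above that a very large number of different codon optimised sequences can perform a similar function.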

[0068] In the polynucleotides of the present invention, the codon usage pattern is altered from that typical of the target
antigen to more closely represent the codon bias of a highly expressed gene in a target organism, for example human
β-actin. The "codon usage coefficient" is a measure of how closely the codon pattern of a given polynucleotide sequence
resembles that of a target species. Codon frequencies can be derived from literature sources for the highly expressed
genes of many species (see e.g. Nakamura et al., Nucleic Acids Research 1996, 24:214-215). The codon frequencies
for each of the 61 codons (expressed as the number of occurrences per 1000 codons of the selected class
of genes) are normalised for each of the twenty natural amino acids, so that the value for the most frequently used codon
for each amino acid is set to 1 and the frequencies for the less common codons are scaled to lie between zero and 1.
Thus each of the 61 codons is assigned a value of 1 or lower for the highly expressed genes of the target species. In
order to calculate a codon usage coefficient for a specific polynucleotide, relative to the highly expressed genes of that
species, the scaled value for each codon of the specific polynucleotide is noted and the geometric mean of all these
values is taken (by dividing the sum of the natural logs of these values by the total number of codons and taking the
anti-log). The coefficient will have a value between zero and 1 and the higher the coefficient the more codons in the
polynucleotide are frequently used codons. If a polynucleotide sequence has a codon usage coefficient of 1, all of the codons
are "most frequent" codons for highly expressed genes of the target species.
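
The calculation described in this paragraph can be sketched in a few lines. The per-1000 frequencies are the Table 1 values for four amino acids only; the function names and the two test sequences are illustrative assumptions, and a real calculation would use the full 61-codon table.

```python
import math

# Occurrences per 1000 codons from Table 1 (subset of amino acids).
FREQ_PER_1000 = {
    "Lys": {"AAG": 43.88, "AAA": 9.76},
    "Asn": {"AAC": 23.22, "AAT": 6.51},
    "Phe": {"TTC": 28.54, "TTT": 6.96},
    "Met": {"ATG": 22.32},
}


def scale_codons(freq_per_1000):
    """Scale each codon so the most frequent codon of its amino acid equals 1."""
    scaled = {}
    for codons in freq_per_1000.values():
        top = max(codons.values())
        for codon, freq in codons.items():
            scaled[codon] = freq / top
    return scaled


def codon_usage_coefficient(sequence, scaled):
    """Geometric mean of the scaled values of all codons in the sequence."""
    if not sequence or len(sequence) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    log_sum = sum(math.log(scaled[codon]) for codon in codons)
    return math.exp(log_sum / len(codons))  # anti-log of the mean natural log


SCALED = scale_codons(FREQ_PER_1000)
print(codon_usage_coefficient("ATGAAGAACTTC", SCALED))  # only most frequent codons
print(codon_usage_coefficient("ATGAAAAATTTT", SCALED))  # rarer synonymous codons
```

The first sequence uses only the most frequent codon for each residue and so scores 1, while the second encodes the same peptide with rarer codons and scores well below 1.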

[0069] According to the present invention, the codon usage pattern of the polynucleotide will preferably exclude codons
representing < 10% of the codons used for a particular amino acid. A relative synonymous codon usage (RSCU) value
is the observed number of codons divided by the number expected if all codons for that amino acid were used equally
frequently. A polynucleotide of the present invention will preferably exclude codons with an RSCU value of less than 0.2
in highly expressed genes of the target organism. A polynucleotide of the present invention will generally have a codon
usage coefficient for highly expressed human genes of greater than 0.6, preferably greater than 0.65, most preferably
greater than 0.7. Codon usage tables for human can also be found in Genbank.
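
A minimal sketch of the RSCU definition and the 0.2 exclusion threshold follows. The codon counts are the "Number" values of Table 1 for lysine, threonine and isoleucine; the function names are illustrative assumptions.

```python
# Observed codon counts from Table 1 ("Number" column), three amino acids only.
CODON_COUNTS = {
    "Lys": {"AAG": 2117.0, "AAA": 471.0},
    "Thr": {"ACG": 405.0, "ACA": 373.0, "ACT": 358.0, "ACC": 1502.0},
    "Ile": {"ATA": 88.0, "ATT": 315.0, "ATC": 1369.0},
}


def rscu_values(codon_counts):
    """RSCU = observed count / count expected if synonymous codons were used equally."""
    rscu = {}
    for codons in codon_counts.values():
        expected = sum(codons.values()) / len(codons)
        for codon, observed in codons.items():
            rscu[codon] = observed / expected
    return rscu


def codons_to_exclude(codon_counts, threshold=0.2):
    """Codons whose RSCU falls below the threshold."""
    return {c for c, value in rscu_values(codon_counts).items() if value < threshold}


RSCU = rscu_values(CODON_COUNTS)
for codon in ("AAG", "AAA", "ACC", "ATA"):
    print(codon, round(RSCU[codon], 3))
print("excluded:", codons_to_exclude(CODON_COUNTS))
```

With these counts only ATA falls below 0.2, which is consistent with its Table 1 fraction of 0.05 being under the 10% limit stated above.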
[0070] In comparison, a highly expressed beta actin gene has a RSCU of 0.747. 
[0071] The codon usage table (Table 1) for Homo sapiens is set out below:

Table 1. Codon usage for human (highly expressed) genes 1/24/91 (human_high.cod)

AmAcid  Codon  Number    /1000   Fraction
Gly     GGG     905.00   18.76   0.24
Gly     GGA     525.00   10.88   0.14
Gly     GGT     441.00    9.14   0.12
Gly     GGC    1867.00   38.70   0.50
Glu     GAG    2420.00   50.16   0.75
Glu     GAA     792.00   16.42   0.25
Asp     GAT     592.00   12.27   0.25
Asp     GAC    1821.00   37.75   0.75
Val     GTG    1866.00   38.68
Val     GTA     134.00    2.78   0.05
Val     GTT     198.00    4.10   0.07
Val     GTC     728.00   15.09   0.25
Ala     GCG     652.00   13.51   0.17
Ala     GCA     488.00   10.12   0.13
Ala     GCT     654.00           0.17
Ala     GCC    2057.00           0.53
Arg     AGG     512.00   10.61   0.18
Arg     AGA     298.00    6.18   0.10
Ser     AGT     354.00    7.34   0.10
Ser     AGC    1171.00   24.27   0.34
Lys     AAG    2117.00   43.88   0.82
Lys     AAA     471.00    9.76   0.18
Asn     AAT     314.00    6.51   0.22
Asn     AAC    1120.00   23.22   0.78
Met     ATG    1077.00   22.32   1.00
Ile     ATA      88.00    1.82   0.05
Ile     ATT     315.00    6.53   0.18
Ile     ATC    1369.00   28.38   0.77
Thr     ACG     405.00    8.40   0.15
Thr     ACA     373.00    7.73   0.14
Thr     ACT     358.00    7.42   0.14
Thr     ACC    1502.00   31.13   0.57
Trp     TGG     652.00   13.51   1.00
End     TGA     109.00    2.26   0.55
Cys     TGT     325.00    6.74   0.32
Cys     TGC     706.00   14.63   0.68
End     TAG      42.00    0.87   0.21
End     TAA      46.00    0.95   0.23
Tyr     TAT     360.00    7.46   0.26
Tyr     TAC    1042.00   21.60   0.74
Leu     TTG     313.00    6.49   0.06
Leu     TTA      76.00    1.58   0.02
Phe     TTT     336.00    6.96   0.20
Phe     TTC    1377.00   28.54   0.80
Ser     TCG     325.00    6.74   0.09
Ser     TCA     165.00    3.42   0.05
Ser     TCT     450.00    9.33   0.13
Ser     TCC     958.00   19.86   0.28
Arg     CGG     611.00   12.67   0.21
Arg     CGA     183.00    3.79   0.06
Arg     CGT     210.00    4.35   0.07
Arg     CGC    1086.00   22.51   0.37
Gln     CAG    2020.00   41.87   0.88
Gln     CAA     283.00    5.87   0.12
His     CAT     234.00    4.85   0.21
His     CAC     870.00   18.03   0.79
Leu     CTG    2884.00   59.78   0.58
Leu     CTA     166.00    3.44   0.03
Leu     CTT     238.00    4.93   0.05
Leu     CTC    1276.00   26.45   0.26
Pro     CCG     482.00    9.99   0.17
Pro     CCA     456.00    9.45   0.16
Pro     CCT     568.00   11.77   0.19
Pro     CCC    1410.00   29.23   0.48
[0072] A DNA sequence encoding the fusion proteins or modified choline binding protein of the present invention can
be synthesised using standard DNA synthesis techniques, such as by enzymatic ligation as described by D.M. Roberts 
et al. in Biochemistry 1985, 24, 5090-5098, by chemical synthesis, by in vitro enzymatic polymerisation, or by PCR 
technology utilising for example a heat stable polymerase, or by a combination of these techniques. 
[0073] Enzymatic polymerisation of DNA may be carried out in vitro using a DNA polymerase such as DNA polymerase
I (Klenow fragment) or Taq polymerase in an appropriate buffer containing the nucleoside triphosphates dATP, dCTP,
dGTP and dTTP as required at a temperature of 10°-37°C, generally in a volume of 50 µl or less. Enzymatic ligation of
DNA fragments may be carried out using a DNA ligase such as T4 DNA ligase in an appropriate buffer, such as 0.05M
Tris (pH 7.4), 0.01M MgCl2, 0.01M dithiothreitol, 1mM spermidine, 1mM ATP and 0.1 mg/ml bovine serum albumin, at
a temperature of 4°C to ambient, generally in a volume of 50 µl or less. The chemical synthesis of the DNA polymer or
fragments may be carried out by conventional phosphotriester, phosphate or phosphoramidite chemistry, using solid
phase techniques such as those described in 'Chemical and Enzymatic Synthesis of Gene Fragments - A Laboratory
Manual' (ed. H.G. Gassen and A. Lang), Verlag Chemie, Weinheim (1982), or in other scientific publications, for example
M.J. Gait, H.W.D. Matthes, M. Singh, B.S. Sproat, and R.C. Titmas, Nucleic Acids Research, 1982, 10, 6243; B.S.
Sproat, and W. Bannwarth, Tetrahedron Letters, 1983, 24, 5771; M.D. Matteucci and M.H. Caruthers, Tetrahedron
Letters, 1980, 21, 719; M.D. Matteucci and M.H. Caruthers, Journal of the American Chemical Society, 1981, 103, 3185;
S.P. Adams et al., Journal of the American Chemical Society, 1983, 105, 661; N.D. Sinha, J. Biernat, J. McMannus, and
H. Koester, Nucleic Acids Research, 1984, 12, 4539; and H.W.D. Matthes et al., EMBO Journal, 1984, 3, 801.
[0074] The process of the invention may be performed by conventional recombinant techniques such as described in
Maniatis et al., Molecular Cloning - A Laboratory Manual; Cold Spring Harbor, 1982-1989.
[0075] In particular, the process may comprise the steps of : 

i) preparing a replicable or integrating expression vector capable, in a host cell, of expressing a DNA polymer 
comprising a nucleotide sequence that encodes the protein or an immunogenic derivative thereof 

ii) transforming a host cell with said vector 

iii) culturing said transformed host cell under conditions permitting expression of said DNA polymer to produce said 
protein; and 

iv) recovering said protein 

[0076] The term 'transforming' is used herein to mean the introduction of foreign DNA into a host cell. This can be 
achieved for example by transformation, transfection or infection with an appropriate plasmid or viral vector using e.g. 
conventional techniques as described in Genetic Engineering: Eds. S.M. Kingsman and A.J. Kingsman; Blackwell Sci- 
entific Publications; Oxford, England, 1988. The term 'transformed' or 'transformant' will hereafter apply to the resulting 
host cell containing and expressing the foreign gene of interest. 
[0077] The expression vectors are novel and also form part of the invention. 

[0078] The replicable expression vectors may be prepared in accordance with the invention, by cleaving a vector 
compatible with the host cell to provide a linear DNA segment having an intact replicon, and combining said linear 
segment with one or more DNA molecules which, together with said linear segment encode the desired product, such 
as the DNA polymer encoding the protein of the invention, or derivative thereof, under ligating conditions. 
[0079] Thus, the DNA polymer may be preformed or formed during the construction of the vector, as desired.
[0080] The choice of vector will be determined in part by the host cell, which may be prokaryotic or eukaryotic but is
preferably E. coli, yeast or CHO cells. Suitable vectors include plasmids, bacteriophages, cosmids and recombinant
viruses. Expression and cloning vectors preferably contain a selectable marker such that only the host cells expressing
the marker will survive under selective conditions. Selection genes include but are not limited to those encoding proteins
that confer resistance to ampicillin, tetracycline or kanamycin. Expression vectors also contain control sequences which
are compatible with the designated host. For example, expression control sequences for E. coli, and more generally for
prokaryotes, include promoters and ribosome binding sites. Promoter sequences may be naturally occurring, such as
the β-lactamase (penicillinase) (Weissman 1981, In Interferon 3 (ed. L. Gresser)), lactose (lac) (Chang et al., Nature,
1977, 198: 1056) and tryptophan (trp) (Goeddel et al., Nucl. Acids Res. 1980, 8, 4057) and lambda-derived PL promoter
system. In addition, synthetic promoters which do not occur in nature also function as bacterial promoters. This is the
case for example for the tac synthetic hybrid promoter which is derived from sequences of the trp and lac promoters
(De Boer et al., Proc. Natl Acad Sci. USA 1983, 80, 21-26). These systems are particularly suitable with E. coli.

[0081] Yeast compatible vectors also carry markers that allow the selection of successful transformants by conferring
prototrophy to auxotrophic mutants or resistance to heavy metals on wild-type strains. Expression control sequences
for yeast vectors include promoters for glycolytic enzymes (Hess et al., J. Adv. Enzyme Reg. 1968, 7, 149), PHO5 gene
encoding acid phosphatase, CUP1 gene, ARG3 gene, GAL genes promoters and synthetic promoter sequences. Other
control elements useful in yeast expression are terminators and mRNA leader sequences. The 5' coding sequence is
particularly useful since it typically encodes a signal peptide comprised of hydrophobic amino acids which direct the
secretion of the protein from the cell. Suitable signal sequences can be encoded by genes for secreted yeast proteins
such as the yeast invertase gene and the a-factor gene, acid phosphatase, killer toxin, the alpha-mating factor gene
and recently the heterologous inulinase signal sequence derived from INU1A gene of Kluyveromyces marxianus. Suitable
vectors have been developed for expression in Pichia pastoris and Saccharomyces cerevisiae.

[0082] A variety of P. pastoris expression vectors are available based on various inducible or constitutive promoters
(Cereghino and Cregg, FEMS Microbiol. Rev. 2000, 24:45-66). For the production of cytosolic and secreted proteins,
the most commonly used P. pastoris vectors contain the very strong and tightly regulated alcohol oxidase (AOX1)
promoter. The vectors also contain the P. pastoris histidinol dehydrogenase (HIS4) gene for selection in his4 hosts.
Secretion of foreign protein requires the presence of a signal sequence and the S. cerevisiae prepro alpha mating factor
signal sequence has been widely and successfully used in the Pichia expression system. Expression vectors are integrated
into the P. pastoris genome to maximize the stability of expression strains. As in S. cerevisiae, cleavage of a P. pastoris
expression vector within a sequence shared by the host genome (AOX1 or HIS4) stimulates homologous recombination
events that efficiently target integration of the vector to that genomic locus. In general, a recombinant strain that contains
multiple integrated copies of an expression cassette can yield more heterologous protein than a single-copy strain. The
most effective way to obtain high copy number transformants requires the transformation of the Pichia recipient strain by
the sphaeroplast technique (Cregg et al., 1985, Mol. Cell. Biol. 5: 3376-3385).

[0083] The preparation of the replicable expression vector may be carried out conventionally with appropriate enzymes
for restriction, polymerisation and ligation of the DNA, by procedures described in, for example, Maniatis et al., cited above.
[0084] The recombinant host cell is prepared, in accordance with the invention, by transforming a host cell with a
replicable expression vector of the invention under transforming conditions. Suitable transforming conditions are conventional
and are described in, for example, Maniatis et al., cited above, or "DNA Cloning" Vol. II, D.M. Glover ed., IRL
Press Ltd, 1985.

[0085] The choice of transforming conditions depends upon the choice of the host cell to be transformed. For example,
in vivo transformation using a live viral vector as the transforming agent for the polynucleotides of the invention is
described above. Bacterial transformation of a host such as E. coli may be done by direct uptake of the polynucleotides
(which may be expression vectors containing the desired sequence) after the host has been treated with a solution of
CaCl2 (Cohen et al., Proc. Nat. Acad. Sci., 1973, 69, 2110) or with a solution comprising a mixture of rubidium chloride
(RbCl), MnCl2, potassium acetate and glycerol, and then with 3-[N-morpholino]-propane-sulphonic acid, RbCl and
glycerol or by electroporation. Transformation of lower eukaryotic organisms such as yeast cells in culture by direct
uptake may be carried out for example by using the method of Hinnen et al. (Proc. Natl. Acad. Sci. 1978, 75: 1929-1933).
Mammalian cells in culture may be transformed using the calcium phosphate coprecipitation of the vector DNA onto the
cells (Graham & Van der Eb, Virology 1978, 52, 546). Other methods for introduction of polynucleotides into mammalian
cells include dextran mediated transfection, polybrene mediated transfection, protoplast fusion, electroporation,
encapsulation of the polynucleotide(s) into liposomes, and direct microinjection of the polynucleotides into nuclei.

[0086] The invention also extends to a host cell transformed with a nucleic acid encoding the protein of the invention 
or a replicable expression vector of the invention. 

[0087] Culturing the transformed host cell under conditions permitting expression of the DNA polymer is carried out
conventionally, as described in, for example, Maniatis et al. and "DNA Cloning" cited above. Thus, preferably the cell is
supplied with nutrient and cultured at a temperature below 50°C, preferably between 25°C and 42°C, more preferably
between 25°C and 35°C, most preferably at 30°C. The incubation time may vary from a few minutes to a few hours,
according to the proportion of the polypeptide in the bacterial cell, as assessed by SDS-PAGE or Western blot.
[0088] The product may be recovered by conventional methods according to the host cell and according to the localisation
of the expression product (intracellular or secreted into the culture medium or into the cell periplasm). Thus,
where the host cell is bacterial, such as E. coli, it may, for example, be lysed physically, chemically or enzymatically and
the protein product isolated from the resulting lysate. Where the host cell is mammalian, the product may generally be
isolated from the nutrient medium or from cell free extracts. Where the host cell is a yeast such as Saccharomyces
cerevisiae or Pichia pastoris, the product may generally be isolated from lysed cells or from the culture medium,
and then further purified using conventional techniques. The specificity of the expression system may be assessed by
western blot or by ELISA using an antibody directed against the polypeptide of interest.

[0089] Conventional protein isolation techniques include selective precipitation, adsorption chromatography, and affinity
chromatography including a monoclonal antibody affinity column. When the proteins of the present invention are
expressed with a histidine tail (His tag), they can easily be purified by affinity chromatography using an immobilised metal ion
affinity chromatography (IMAC) column. The metal ion may be any suitable ion, for example zinc, nickel, iron, magnesium
or copper, but is preferably zinc or nickel. Preferably the IMAC buffer contains detergent, preferably an anionic detergent 
such as SDS, more preferably a non-ionic detergent such as Tween 80, or a zwitterionic detergent such as Empigen 
BB, as this may result in lower levels of endotoxin in the final product. 

[0090] Further chromatographic steps include for example a Q-Sepharose step that may be operated either before or
after the IMAC column. Preferably the pH is in the range of 7.5 to 10, more preferably from 7.5 to 9.5, optimally between 
8 and 9. 

[0091] The proteins of the invention can thus be purified according to the following protocol. After cell disruption, cell 
extracts containing the protein can be solubilised in a pH 8.5 Tris buffer containing urea (8.0 M for example), and SDS
(from 0.5% to 1% for example). After centrifugation, the resulting supernatant may then be loaded onto an IMAC
(Nickel) Sepharose FF column equilibrated with a pH 8.5 Tris buffer. The column may then be washed with a high salt
containing buffer (eg 0.75 - 1.5 M NaCl, 15 mM pH 8.5 Tris buffer). The column may optionally then be washed again
with phosphate buffer without salt. The proteins of the invention may be eluted from the column with an imidazole-containing
buffered solution. The proteins can then be submitted to an additional chromatographic step, such as
anion exchange chromatography (Q Sepharose for example).
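
The purification sequence of paragraph [0091] can be written down as an ordered list of steps, which makes the order of operations and the example buffer conditions explicit. The Python sketch below is illustrative only: the `Step` record, its field names and the step labels are hypothetical, the conditions are the examples quoted in the paragraph, and nothing here prescribes actual process parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str            # hypothetical label for the operation
    conditions: str      # example conditions taken from paragraph [0091]
    optional: bool = False


# Ordered protocol, following the text of paragraph [0091].
PROTOCOL = [
    Step("cell disruption", "method depends on the host cell"),
    Step("solubilisation", "pH 8.5 Tris buffer, urea (e.g. 8.0 M), SDS (e.g. 0.5% to 1%)"),
    Step("centrifugation", "keep the supernatant"),
    Step("IMAC load", "Nickel Sepharose FF column equilibrated with pH 8.5 Tris buffer"),
    Step("high salt wash", "e.g. 0.75 - 1.5 M NaCl, 15 mM pH 8.5 Tris buffer"),
    Step("phosphate wash", "phosphate buffer without salt", optional=True),
    Step("elution", "imidazole-containing buffered solution"),
    Step("anion exchange", "e.g. Q Sepharose", optional=True),
]


def describe(protocol):
    """Return one numbered line per step, marking the optional steps."""
    lines = []
    for i, step in enumerate(protocol, start=1):
        tag = " (optional)" if step.optional else ""
        lines.append(f"{i}. {step.name}{tag}: {step.conditions}")
    return lines


if __name__ == "__main__":
    print("\n".join(describe(PROTOCOL)))
```

Keeping the steps in a list rather than in free text makes it straightforward to check that, for instance, the high salt wash always precedes elution.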

[0092] The proteins of the present invention are provided either soluble in a liquid form or in a lyophilised form, which 
is the preferred form. It is generally expected that each human dose will comprise 1 to 1000 µg of protein, and preferably
30-300 µg. The purification process can also include a carboxyamidation step whereby the protein is first reduced in
the presence of glutathione and then carboxymethylated in the presence of iodoacetamide. This step offers the advantage
of controlling the oxidative aggregation of the molecule with itself or with host cell protein contaminants through covalent
bridging with disulphide bonds. 

[0093] The present invention also provides pharmaceutical and immunogenic compositions comprising a protein of 
the present invention in a pharmaceutically acceptable excipient. 

A preferred vaccine composition comprises at least a protein according to the invention. Said protein has, preferably, 
blocked thiol groups and is highly purified, e.g. has less than 5% host cell contamination. Such a vaccine may optionally
contain one or more other tumour-associated antigens and derivatives. For example, suitable other associated antigens
include prostase, PAP-1, PSA (prostate specific antigen), PSMA (prostate-specific membrane antigen), PSCA (Prostate
Stem Cell Antigen), STEAP. 

[0094] In another embodiment, illustrative immunogenic compositions, such as for example vaccine compositions, of
the present invention comprise DNA encoding one or more of the fusion polypeptides as described above, such that the
fusion polypeptide is generated in situ. As noted above, the polynucleotide may be administered within any of a variety
of delivery systems known to those of ordinary skill in the art. Indeed, numerous gene delivery techniques are well known
in the art, such as those described by Rolland, Crit. Rev. Therap. Drug Carrier Systems 15:143-198, 1998, and references
cited therein. Appropriate polynucleotide expression systems will, of course, contain the necessary DNA
regulatory sequences for expression in a patient (such as a suitable promoter and terminating signal). Alternatively,
bacterial delivery systems may involve the administration of a bacterium (such as Bacillus Calmette-Guérin) that expresses
an immunogenic portion of the polypeptide on its cell surface or secretes such an epitope.
[0095] Therefore, in certain embodiments, polynucleotides encoding immunogenic polypeptides described herein are 
introduced into suitable mammalian host cells for expression using any of a number of known viral-based systems. In
one illustrative embodiment, retroviruses provide a convenient and effective platform for gene delivery systems. A
selected nucleotide sequence encoding a polypeptide of the present invention can be inserted into a vector and packaged
in retroviral particles using techniques known in the art. The recombinant virus can then be isolated and delivered to a
subject. A number of illustrative retroviral systems have been described (e.g., U.S. Pat. No. 5,219,740; Miller and Rosman
(1989) BioTechniques 7:980-990; Miller, A. D. (1990) Human Gene Therapy 1:5-14; Scarpa et al. (1991) Virology 180:849-852;
Burns et al. (1993) Proc. Natl. Acad. Sci. USA 90:8033-8037; and Boris-Lawrie and Temin (1993) Cur. Opin.
Genet. Develop. 3:102-109).

[0096] In addition, a number of illustrative adenovirus-based systems have also been described. Unlike retroviruses 
which integrate into the host genome, adenoviruses persist extrachromosomally thus minimizing the risks associated 
with insertional mutagenesis (Haj-Ahmad and Graham (1986) J. Virol. 57:267-274; Bett et al. (1993) J. Virol. 67:5911-5921;
Mittereder et al. (1994) Human Gene Therapy 5:717-729; Seth et al. (1994) J. Virol. 68:933-940; Barr et al.
(1994) Gene Therapy 1:51-58; Berkner, K. L. (1988) BioTechniques 6:616-629; and Rich et al. (1993) Human Gene
Therapy 4:461-476). Since humans are sometimes infected by common human adenovirus serotypes such as AdHu5,
a significant proportion of the population have a neutralizing antibody response to the adenovirus, which is likely to affect
the immune response to a heterologous antigen in a recombinant vaccine based system. Non-human primate adenoviral
vectors such as the chimpanzee adenovirus 68 (AdC68, Fitzgerald et al. (2003) J. Immunol 170(3):1416-22) may
offer an alternative adenoviral system without the disadvantage of a pre-existing neutralising antibody response.
[0097] Various adeno-associated virus (AAV) vector systems have also been developed for polynucleotide delivery.
AAV vectors can be readily constructed using techniques well known in the art. See, e.g., U.S. Pat. Nos. 5,173,414 and
5,139,941; International Publication Nos. WO 92/01070 and WO 93/03769; Lebkowski et al. (1988) Molec. Cell. Biol. 8:3988-3996;
Vincent et al. (1990) Vaccines 90 (Cold Spring Harbor Laboratory Press); Carter, B.J. (1992) Current Opinion
in Biotechnology 3:533-539; Muzyczka, N. (1992) Current Topics in Microbiol. and Immunol. 158:97-129; Kotin, R. M.
(1994) Human Gene Therapy 5:793-801; Shelling and Smith (1994) Gene Therapy 1:165-169; and Zhou et al. (1994)
J. Exp. Med. 179:1867-1875.

[0098] Additional viral vectors useful for delivering the nucleic acid molecules encoding polypeptides of the present 
invention by gene transfer include those derived from the pox family of viruses, such as vaccinia virus and avian poxvirus. 
By way of example, vaccinia virus recombinants expressing the novel molecules can be constructed as follows. The 
DNA encoding a polypeptide is first inserted into an appropriate vector so that it is adjacent to a vaccinia promoter and
flanking vaccinia DNA sequences, such as the sequence encoding thymidine kinase (TK). This vector is then used to
transfect cells which are simultaneously infected with vaccinia. Homologous recombination serves to insert the vaccinia
promoter plus the gene encoding the polypeptide of interest into the viral genome. The resulting TK(-) recombinant
can be selected by culturing the cells in the presence of 5-bromodeoxyuridine and picking viral plaques resistant thereto.
[0099] A vaccinia-based infection/transfection system can be conveniently used to provide for inducible, transient
expression or coexpression of one or more polypeptides described herein in host cells of an organism. In this particular
system, cells are first infected in vitro with a vaccinia virus recombinant that encodes the bacteriophage T7 RNA polymerase.
This polymerase displays exquisite specificity in that it only transcribes templates bearing T7 promoters. Following
infection, cells are transfected with the polynucleotide or polynucleotides of interest, driven by a T7 promoter. The
polymerase expressed in the cytoplasm from the vaccinia virus recombinant transcribes the transfected DNA into RNA
which is then translated into polypeptide by the host translational machinery. The method provides for high level, transient,
cytoplasmic production of large quantities of RNA and its translation products. See, e.g., Elroy-Stein and Moss, Proc. 
Natl. Acad. Sci. USA (1990) 87:6743-6747; Fuerst et al. Proc. Natl. Acad. Sci. USA (1986) 83:8122-8126. 
[0100] Alternatively, avipoxviruses, such as the fowlpox and canarypox viruses, can also be used to deliver the coding 
sequences of interest. Recombinant avipox viruses, expressing immunogens from mammalian pathogens, are known
to confer protective immunity when administered to non-avian species. The use of an Avipox vector is particularly
desirable in human and other mammalian species since members of the Avipox genus can only productively replicate 
in susceptible avian species and therefore are not infective in mammalian cells. Methods for producing recombinant 
Avipoxviruses are known in the art and employ genetic recombination, as described above with respect to the production 
of vaccinia viruses. See, e.g., WO 91/12882; WO 89/03429; and WO 92/03545. 

[0101] Any of a number of alphavirus vectors can also be used for delivery of polynucleotide compositions of the
present invention, such as those vectors described in U.S. Patent Nos. 5,843,723; 6,015,686; 6,008,035 and 6,015,694.
Certain vectors based on Venezuelan Equine Encephalitis (VEE) can also be used, illustrative examples of which can 
be found in U.S. Patent Nos. 5,505,947 and 5,643,576. 

[0102] The compositions of the present invention can be delivered by a number of routes such as intramuscularly,
subcutaneously, intraperitoneally or intravenously.

[0103] In another embodiment of the invention, a polynucleotide is administered/delivered as "naked" DNA, for example 
as described in Ulmer et al., Science 259:1745-1749, 1993 and reviewed by Cohen, Science 259:1691-1692, 1993. The
uptake of naked DNA may be increased by coating the DNA onto biodegradable beads, which are efficiently transported
into the cells. In a preferred embodiment, the composition is delivered intradermally. In particular, the composition is
delivered by means of gene gun (particularly particle bombardment) administration techniques, which involve coating
the vector onto beads (eg gold) which are then administered under high pressure into the epidermis, for
example as described in Haynes et al, J Biotechnology 44: 37-42 (1996).

[0104] In one illustrative example, gas-driven particle acceleration can be achieved with devices such as those manufactured
by Powderject Pharmaceuticals PLC (Oxford, UK) and Powderject Vaccines Inc. (Madison, WI), some examples
of which are described in U.S. Patent Nos. 5,846,796; 6,010,478; 5,865,796; 5,584,807; and EP Patent No. 0500799.
This approach offers a needle-free delivery approach wherein a dry powder formulation of microscopic particles, such
as polynucleotide, are accelerated to high speed within a helium gas jet generated by a hand held device, propelling
the particles into a target tissue of interest, typically the skin. The particles are preferably gold beads of 0.4 - 4.0 µm,
more preferably 0.6 - 2.0 µm, diameter, and the DNA conjugate is coated onto these and then encased in a cartridge or
cassette for placing into the "gene gun".

[0105] In a related embodiment, other devices and methods that may be useful for gas-driven needle-less injection 
of compositions of the present invention include those provided by Bioject, Inc. (Portland, OR), some examples of which 
are described in U.S. Patent Nos. 4,790,824; 5,064,413; 5,312,335; 5,383,851; 5,399,163; 5,520,639 and 5,993,412.
[0106] It is possible for the immunogen component comprising the nucleotide sequence encoding the antigenic peptide
to be administered on a once off basis or to be administered repeatedly, for example, between 1 and 7 times, preferably
between 1 and 4 times, at intervals between about 1 day and about 18 months. However, this treatment regime will be
significantly varied depending upon the size of the patient, the disease which is being treated/protected against, the
amount of nucleotide sequence administered, the route of administration, and other factors which would be apparent to
a skilled medical practitioner. 

[0107] It is therefore another aspect of the present invention to provide for the use of a protein or a DNA encoding 
said protein, as described herein, in the manufacture of an immunogenic composition for eliciting an immune response 
in a patient. Preferably the immune response is to be elicited by sequential administration of i) the said protein followed
by the said DNA sequence; or ii) the said DNA sequence followed by the said protein. More preferably the DNA sequence
is coated onto biodegradable beads or delivered via a particle bombardment approach. Still more preferably the protein
is adjuvanted, preferably with a TH-1 inducing adjuvant, preferably with a CpG/QS21 based adjuvant formulation.
[0108] The vectors which comprise the nucleotide sequences encoding antigenic peptides are administered in such 
amount as will be prophylactically or therapeutically effective. The quantity to be administered is generally in the range
of one picogram to 16 milligram of nucleotide per dose, preferably 1 picogram to 10 micrograms for particle-mediated
delivery, and 10 micrograms to 16 milligram for other routes. The exact quantity may vary considerably depending on
the weight of the patient being immunised and the route of administration. 
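
The dose ranges in paragraph [0108] span about ten orders of magnitude, so unit slips are easy to make. The following Python sketch checks a candidate quantity against the stated range for a route. It is a hypothetical helper, not part of the described method: the function name, the route labels and the choice to work in whole picograms (to avoid floating-point rounding at the boundaries) are assumptions made for illustration.

```python
# Unit conversions, expressed in picograms so that all bounds are exact integers.
PG_PER_UG = 1_000_000          # 1 microgram = 10^6 picograms
PG_PER_MG = 1_000_000_000      # 1 milligram = 10^9 picograms

# Ranges of nucleotide per dose from paragraph [0108] (route labels are hypothetical).
RANGES_PG = {
    "particle-mediated": (1, 10 * PG_PER_UG),           # 1 pg to 10 micrograms
    "other": (10 * PG_PER_UG, 16 * PG_PER_MG),          # 10 micrograms to 16 mg
}


def dose_in_range(dose_pg: int, route: str) -> bool:
    """Return True if dose_pg lies within the stated range for the route (bounds inclusive)."""
    low, high = RANGES_PG[route]
    return low <= dose_pg <= high


if __name__ == "__main__":
    print(dose_in_range(2 * PG_PER_UG, "particle-mediated"))  # 2 micrograms by gene gun
    print(dose_in_range(2 * PG_PER_UG, "other"))              # 2 micrograms by another route
```

As the paragraph notes, the exact quantity still depends on the patient and the route, so a check of this kind only confirms that a value falls within the general range.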

[0109] Suitable techniques for introducing the naked polynucleotide or vector into a patient also include topical application
with an appropriate vehicle. The nucleic acid may be administered topically to the skin, or to mucosal surfaces
for example by intranasal, oral, intravaginal or intrarectal administration. The naked polynucleotide or vector may be
present together with a pharmaceutically acceptable excipient, such as phosphate buffered saline (PBS). DNA uptake
may be further facilitated by use of facilitating agents such as bupivacaine, either separately or included in the DNA
formulation. Other methods of administering the nucleic acid directly to a recipient include ultrasound, electrical stimulation,
electroporation and microseeding which is described in US 5,697,901.

[0110] Uptake of nucleic acid constructs may be enhanced by several known transfection techniques, for example
those including the use of transfection agents. Examples of these agents include cationic agents, for example, calcium
phosphate and DEAE-Dextran and lipofectants, for example, lipofectam and transfectam. The dosage of the nucleic 
acid to be administered can be altered. 

[0111] The fusion proteins and encoding polynucleotides according to the invention can also be formulated as a pharmaceutical/immunogenic
composition, e.g. as a vaccine. Accordingly, the present invention also provides for a pharmaceutical/immunogenic
composition comprising a fusion protein of the present invention in a pharmaceutically acceptable
excipient. Accordingly there is also provided a process for the preparation of an immunogenic composition according
to the present invention, comprising admixing the fusion protein of the invention or the encoding polynucleotide with a
suitable adjuvant, diluent or other pharmaceutically acceptable carrier.
[0112] The fusion proteins of the present invention are provided preferably at least 80% pure, more preferably 90%
pure as visualised by SDS PAGE. Preferably the proteins appear as a single band by SDS PAGE. 
[0113] Vaccine preparation is generally described in Vaccine Design ("The subunit and adjuvant approach", eds.
Powell M.F. & Newman M.J. (1995) Plenum Press New York). Encapsulation within liposomes is described by Fullerton,
US Patent 4,235,877. 

[0114] The fusion proteins of the present invention and encoding polynucleotides are preferably adjuvanted in the
vaccine formulation of the invention. Certain adjuvants are commercially available as, for example, Freund's Incomplete
Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit, MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway,
NJ); AS-2 (SmithKline Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or aluminum
phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated tyrosine; acylated sugars; cationically or
anionically derivatised polysaccharides; polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and
quil A. Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be used as adjuvants. 
[0115] Within certain embodiments of the invention, the adjuvant composition is preferably one that induces an immune
response predominantly of the Th1 type. High levels of Th1-type cytokines (e.g., IFN-γ, TNFα, IL-2 and IL-12) tend to
favor the induction of cell mediated immune responses to an administered antigen. In contrast, high levels of Th2-type
cytokines (e.g., IL-4, IL-5, IL-6 and IL-10) tend to favor the induction of humoral immune responses. Following application
of a vaccine as provided herein, a patient will support an immune response that includes Th1- and Th2-type responses.
Within a preferred embodiment, in which a response is predominantly Th1-type, the level of Th1-type cytokines will
increase to a greater extent than the level of Th2-type cytokines. The levels of these cytokines may be readily assessed
using standard assays. For a review of the families of cytokines, see Mosmann and Coffman, Ann. Rev. Immunol. 7:145-173, 1989.
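
The criterion in paragraph [0115], that Th1-type cytokine levels increase to a greater extent than Th2-type levels, can be made concrete with a small comparison of fold changes. The Python sketch below is one possible reading, under stated assumptions: summarising each group by the mean of the after/before ratios is a choice made here for illustration and is not specified in the text, the function names are hypothetical, and the cytokine lists are simply the examples given in the paragraph.

```python
from statistics import mean

# Example cytokines named in paragraph [0115].
TH1 = ("IFN-gamma", "TNF-alpha", "IL-2", "IL-12")
TH2 = ("IL-4", "IL-5", "IL-6", "IL-10")


def mean_fold_change(before, after, cytokines):
    """Mean of after/before ratios over the cytokines measured at both time points.

    Assumes every baseline value is non-zero.
    """
    ratios = [after[c] / before[c] for c in cytokines if c in before and c in after]
    if not ratios:
        raise ValueError("no measurements for the requested cytokines")
    return mean(ratios)


def predominantly_th1(before, after):
    """True if Th1-type cytokines rose more, on average, than Th2-type cytokines."""
    return mean_fold_change(before, after, TH1) > mean_fold_change(before, after, TH2)


if __name__ == "__main__":
    # Made-up illustrative levels (arbitrary units), not experimental data.
    before = {"IFN-gamma": 10.0, "IL-2": 5.0, "IL-4": 8.0, "IL-10": 4.0}
    after = {"IFN-gamma": 40.0, "IL-2": 10.0, "IL-4": 8.0, "IL-10": 6.0}
    print(predominantly_th1(before, after))
```

Other summaries (for example a median, or a comparison per cytokine pair) would be equally compatible with the wording of the paragraph.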

[0116] Preferred TH-1 inducing adjuvants are selected from the group of adjuvants comprising: 3D-MPL, QS21, a 
mixture of QS21 and cholesterol, and a CpG oligonucleotide or a mixture of two or more said adjuvants. Certain preferred 
adjuvants for eliciting a predominantly Th1-type response include, for example, a combination of monophosphoryl lipid
A, preferably 3-de-O-acylated monophosphoryl lipid A, together with an aluminum salt. MPL® adjuvants are available
from Corixa Corporation (Seattle, WA; see, for example, US Patent Nos. 4,436,727; 4,877,611; 4,866,034 and 4,912,094).
CpG-containing oligonucleotides (in which the CpG dinucleotide is unmethylated) also induce a predominantly Th1 
response. Such oligonucleotides are well known and are described, for example, in WO 96/02555, WO 99/33488 and 
U.S. Patent Nos. 6,008,200 and 5,856,462. Immunostimulatory DNA sequences are also described, for example, by 
Sato et al., Science 273:352, 1996. Another preferred adjuvant comprises a saponin, such as Quil A, or derivatives 
thereof, including QS21 and QS7 (Aquila Biopharmaceuticals Inc., Framingham, MA); Escin; Digitonin; or Gypsophila 
or Chenopodium quinoa saponins. Other preferred formulations include more than one saponin in the adjuvant combinations
of the present invention, for example combinations of at least two of the following group comprising QS21,
QS7, Quil A, β-escin, or digitonin.

[0117] Alternatively the saponin formulations may be combined with vaccine vehicles composed of chitosan or other
polycationic polymers, polylactide and polylactide-co-glycolide particles, poly-N-acetyl glucosamine-based polymer matrix,
particles composed of polysaccharides or chemically modified polysaccharides, liposomes and lipid-based particles,
particles composed of glycerol monoesters, etc. The saponins may also be formulated in the presence of cholesterol to 
form particulate structures such as liposomes or ISCOMs. Furthermore, the saponins may be formulated together with 
a polyoxyethylene ether or ester, in either a non-particulate solution or suspension, or in a particulate structure such as 
a paucilamellar liposome or ISCOM. The saponins may also be formulated with excipients such as Carbopol® to increase
viscosity, or may be formulated in a dry powder form with a powder excipient such as lactose. 
[0118] In one preferred embodiment, the adjuvant system includes the combination of a monophosphoryl lipid A and 
a saponin derivative, such as the combination of QS21 and 3D-MPL® adjuvant, as described in WO 94/00153, or a less 
reactogenic composition where the QS21 is quenched with cholesterol, as described in WO 96/33739. Other preferred 
formulations comprise an oil-in-water emulsion and tocopherol. Another particularly preferred adjuvant formulation em- 
ploying QS21, 3D-MPL® adjuvant and tocopherol in an oil-in-water emulsion is described in WO 95/17210. 
[0119] Another enhanced adjuvant system involves the combination of a CpG-containing oligonucleotide and a saponin 
derivative, particularly the combination of CpG and QS21 as disclosed in WO 00/09159 and in WO 00/62800. Preferably
the formulation additionally comprises an oil in water emulsion and tocopherol. 

[0120] In a yet further embodiment the present invention provides an immunogenic composition comprising a fusion 
protein according to the invention, and further comprising 3D-MPL, a saponin, preferably QS21, and a CpG oligonucleotide,
optionally formulated in an oil in water emulsion. 

[0121] Additional illustrative adjuvants for use in the pharmaceutical compositions of the invention include Montanide 
ISA 720 (Seppic, France), SAF (Chiron, California, United States), ISCOMS (CSL), MF-59 (Chiron), the SBAS series 
of adjuvants (e.g., SBAS-2 or SBAS-4, available from SmithKline Beecham, Rixensart, Belgium), Detox (Enhanzyn®)
(Corixa, Hamilton, MT), RC-529 (Corixa, Hamilton, MT) and other aminoalkyl glucosaminide 4-phosphates (AGPs), such
as those described in pending U.S. Patent Application Serial Nos. 08/853,826 and 09/074,720, the disclosures of which
are incorporated herein by reference in their entireties, and polyoxyethylene ether adjuvants such as those described
in WO 99/52549A1.

[0122] Other preferred adjuvants include adjuvant molecules of the general formula (I): 

HO(CH₂CH₂O)ₙ-A-R, wherein n is 1-50, A is a bond or -C(O)-, and R is C₁₋₅₀ alkyl or phenyl C₁₋₅₀ alkyl. One embodiment
of the present invention consists of a vaccine formulation comprising a polyoxyethylene ether of general formula (I),
wherein n is between 1 and 50, preferably 4-24, most preferably 9; the R component is C₁₋₅₀, preferably C₄-C₂₀ alkyl
and most preferably C₁₂ alkyl, and A is a bond. The concentration of the polyoxyethylene ethers should be in the range
0.1-20%, preferably from 0.1-10%, and most preferably in the range 0.1-1%. Preferred polyoxyethylene ethers are
selected from the following group: polyoxyethylene-9-lauryl ether, polyoxyethylene-9-stearyl ether, polyoxyethylene-8-stearyl
ether, polyoxyethylene-4-lauryl ether, polyoxyethylene-35-lauryl ether, and polyoxyethylene-23-lauryl ether.
Polyoxyethylene ethers such as polyoxyethylene lauryl ether are described in the Merck index (12th edition: entry 7717).
These adjuvant molecules are described in WO 99/52549. The polyoxyethylene ether according to the general formula
(I) above may, if desired, be combined with another adjuvant. For example, a preferred adjuvant combination is
with CpG as described in the pending UK patent application GB 9820956.2.
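
The nested preference ranges for general formula (I) in paragraph [0122] can be illustrated with a small classifier for the case where A is a bond and R is a straight alkyl chain. This Python sketch is an assumption-laden illustration: the function name and the tier labels are hypothetical, and combining the separate preferences for n and for R into joint tiers is a simplification made here, not something the text states.

```python
def classify_ether(n: int, alkyl_carbons: int) -> str:
    """Classify HO(CH2CH2O)n-A-R (A = bond, R = alkyl) against the ranges of [0122].

    n             -- number of oxyethylene repeat units
    alkyl_carbons -- number of carbon atoms in the alkyl group R
    """
    if not (1 <= n <= 50 and 1 <= alkyl_carbons <= 50):
        return "outside formula (I)"
    if n == 9 and alkyl_carbons == 12:
        return "most preferred"          # n = 9 and C12 alkyl
    if 4 <= n <= 24 and 4 <= alkyl_carbons <= 20:
        return "preferred"               # n = 4-24 and C4-C20 alkyl
    return "within formula (I)"


if __name__ == "__main__":
    # Lauryl is a C12 chain and stearyl a C18 chain.
    print(classify_ether(9, 12))    # polyoxyethylene-9-lauryl ether
    print(classify_ether(23, 12))   # polyoxyethylene-23-lauryl ether
    print(classify_ether(35, 12))   # polyoxyethylene-35-lauryl ether
```

Note that an ether listed as preferred in the text, such as polyoxyethylene-35-lauryl ether, can fall outside the narrower numeric range for n, which shows that the named group and the numeric preferences are separate statements.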

[0123] It is an embodiment of the invention that the antigens, including nucleic acid vector, of the invention be utilised 
with an immunostimulatory agent. Preferably the immunostimulatory agent is administered at the same time as the antigens
of the invention and in preferred embodiments are formulated together. It is another embodiment of the invention that 
the antigen and immunostimulatory agent (or vice versa) are administered sequentially to the same or adjacent sites, 
separated in time by periods of between 0-100 hours. Such immunostimulatory agents include but are not limited to: 
synthetic imidazoquinolines such as imiquimod [S-26308, R-837] (Harrison, et al., Vaccine 19: 1820-1826, 2001) and
resiquimod [S-28463, R-848] (Vasilakos, et al., Cellular Immunology 204: 64-74, 2000); Schiff bases of carbonyls and
amines that are constitutively expressed on antigen presenting cell and T-cell surfaces, such as tucaresol (Rhodes, J.
et al., Nature 377: 71-75, 1995); cytokine, chemokine and co-stimulatory molecules as either protein or peptide, including
for example pro-inflammatory cytokines such as Interferon, GM-CSF, IL-1 alpha, IL-1 beta, TGF-alpha and TGF-beta,
Th1 inducers such as interferon gamma, IL-2, IL-12, IL-15, IL-18 and IL-21, Th2 inducers such as IL-4, IL-5, IL-6, IL-10
and IL-13 and other chemokine and co-stimulatory genes such as MCP-1, MIP-1 alpha, MIP-1 beta, RANTES, TCA-3,
CD80, CD86 and CD40L; other immunostimulatory targeting ligands such as CTLA-4 and L-selectin; apoptosis stimulating
proteins and peptides such as Fas (49); synthetic lipid based adjuvants, such as vaxfectin (Reyes et al., Vaccine 19:
3778-3786, 2001), squalene, alpha-tocopherol, polysorbate 80, DOPC and cholesterol; endotoxin [LPS] (Beutler, B.,
Current Opinion in Microbiology 3: 23-30, 2000); CpG oligo- and di-nucleotides (Sato, Y. et al., Science 273 (5273):
352-354, 1996; Hemmi, H. et al., Nature 408: 740-745, 2000) and other potential ligands that trigger Toll receptors to
produce Th1-inducing cytokines, such as synthetic Mycobacterial lipoproteins, Mycobacterial protein p19, peptidoglycan,
teichoic acid and lipid A.

[0124] Other suitable adjuvants include CT (cholera toxin, subunits A and B) and LT (heat labile enterotoxin from E.
coli, subunits A and B), heat shock protein family (HSPs), and LLO (listeriolysin O; WO 01/72329).
[0125] Where the immunostimulatory agent is a protein, the agent may be administered either as a protein or as a 
polynucleotide encoding the protein. 

[0126] Other suitable delivery systems include microspheres wherein the antigenic material is incorporated into or 
conjugated to biodegradable polymers/microspheres so that the antigenic material can be mixed with a suitable pharmaceutical
carrier and used as a vaccine. The term "microspheres" is generally employed to describe colloidal particles
which are substantially spherical and have a diameter in the range 10 nm to 2 mm. Microspheres made from a very wide 
range of natural and synthetic polymers have found use in a variety of biomedical applications. This delivery system is 
especially advantageous for proteins having short half-lives in vivo requiring multiple treatments to provide efficacy, or 
being unstable in biological fluids or not fully absorbed from the gastrointestinal tract because of their relatively high 
molecular weights. Several polymers have been described as a matrix for protein release. Suitable polymers include 
gelatin, collagen, alginates, dextran. Preferred delivery systems include biodegradable poly(DL-lactic acid) (PLA),
poly(lactide-co-glycolide) (PLG), poly(glycolic acid) (PGA), poly(ε-caprolactone) (PCL), and copolymers poly(DL-lactic-co-glycolic
acid) (PLGA). Other preferred systems include heterogeneous hydrogels such as poly(ether ester) multiblock
copolymers, containing repeating blocks based on hydrophilic poly(ethylene glycol) (PEG) and hydrophobic poly(butylene
terephthalate) (PBT), or poly(ethylene glycol)-terephthalate/poly(butylene terephthalate) (PEGT/PBT) (Sohier et al.
Eur. J. Pharm. and Biopharm., 2003, 55, 221-228). Systems are preferred which provide a sustained release for 1 to 3
months such as PLGA, PLA and PEGT/PBT.

[0127] It is possible for the immunogenic or vaccine composition to be administered on a once off basis or, preferably, 
to be administered repeatedly, as many times as necessary, for example, between 1 and 7 times, preferably between 
1 and 4 times, at intervals between about 1 day and about 18 months, preferably one month. This may be optionally 
followed by dosing at regular intervals of between 1 and 12 months for a period up to the remainder of the patient's life. 
In a preferred embodiment the patient receives the antigen in different forms in a "prime boost" regime. Thus for example 
the antigen, the fusion protein, is first administered as a protein adjuvant base formulation and then subsequently 
administered as a DNA based vaccine. This administration mode is preferred. The preferred adjuvant is a combination 
of a CpG-containing oligonucleotide and a saponin derivative, particularly the combination of CpG and QS21 as disclosed
in WO 00/09159 and in WO 00/62800. The uptake of naked DNA may be increased by coating the DNA onto biodegradable
beads, which are efficiently transported into the cells. Alternatively the DNA can be delivered via a particle bombardment
approach, for example, gas-driven particle acceleration with devices such as those manufactured by Powderject Pharmaceuticals
PLC (Oxford, UK) and Powderject Vaccines Inc. (Madison, WI) as taught herein. This approach offers a
needle-free delivery approach wherein a dry powder formulation of microscopic particles, such as polynucleotide or 
polypeptide particles, are accelerated to high speed within a helium gas jet generated by a hand held device, propelling 
the particles into a target tissue of interest. 
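
The repeated-administration schedule of paragraph [0127] (between 1 and 7 administrations at intervals between about 1 day and about 18 months, preferably one month) can be laid out as calendar dates. The Python sketch below is a hypothetical illustration only: the function name is invented, "one month" is approximated as 30 days and "18 months" as 548 days, and a fixed interval is assumed although the text allows the interval to vary. Any actual regime would be set by a skilled medical practitioner, as the specification itself notes.

```python
from datetime import date, timedelta

MAX_DOSES = 7               # upper bound stated in paragraph [0127]
MAX_INTERVAL_DAYS = 548     # assumption: about 18 months
DEFAULT_INTERVAL_DAYS = 30  # assumption: about one month


def dose_dates(start, n_doses, interval_days=DEFAULT_INTERVAL_DAYS):
    """Return the administration dates for n_doses given at a fixed interval."""
    if not 1 <= n_doses <= MAX_DOSES:
        raise ValueError("number of administrations must be between 1 and 7")
    if not 1 <= interval_days <= MAX_INTERVAL_DAYS:
        raise ValueError("interval must be between about 1 day and about 18 months")
    return [start + timedelta(days=i * interval_days) for i in range(n_doses)]


if __name__ == "__main__":
    for d in dose_dates(date(2024, 1, 1), 3):
        print(d.isoformat())
```

In a "prime boost" regime the same list of dates could be labelled with the form administered at each time point (protein adjuvant formulation or DNA based vaccine); how many doses of each form are given is not fixed by the text.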

[0128] In another preferred embodiment, the DNA based vaccine will be administered first, followed by the protein 
adjuvant base formulation. Still another embodiment will concern the delivery of the DNA construct by means of spe- 
cialised delivery vectors, preferably by the means of viral system, most preferably by the means of adenoviral-based 
systems. Other suitable viral-based systems of DNA delivery include retroviral, lentiviral, adeno-associated viral, herpes 
viral and vaccinia-viral based systems. 

[0129] In another preferred embodiment, the protein adjuvant base formulation and DNA based vaccine may be
co-administered at adjacent or overlapping sites. Dependent upon the nature of the DNA vaccine formulation, this can be
achieved by mixing the DNA and protein adjuvant formulations prior to administration or by simultaneous administration
of the DNA and protein adjuvant formulations.

[0130] The treatment regime will vary significantly depending upon the size and species of patient concerned,
the amount of nucleic acid vaccine and/or protein composition administered, the route of administration, the potency
and dose of any adjuvant compounds used and other factors which would be apparent to a skilled medical practitioner. 
[0131] Within further aspects, the present invention provides methods for stimulating an immune response in a patient, 
preferably a T cell response in a human patient, comprising administering a pharmaceutical composition described 
herein. The patient may be afflicted with lung, colon, colorectal or breast cancer, in which case the
methods provide treatment for the disease, or a patient considered at risk for such a disease may be treated prophylactically.
[0132] Within further aspects, the present invention provides methods for inhibiting the development of a cancer in a 
patient, comprising administering to a patient a pharmaceutical composition as recited above. The patient may be afflicted 
with, for example, sarcoma, prostate, ovarian, bladder, lung, colon, colorectal or breast cancer, in which case the methods
provide treatment for the disease, or a patient considered at risk for such a disease may be treated prophylactically.
[0133] The present invention further provides, within other aspects, methods for removing tumour cells from a biological
sample, comprising contacting a biological sample with T cells that specifically react with a polypeptide of the present 
invention, wherein the step of contacting is performed under conditions and for a time sufficient to permit the removal 
of cells expressing the protein from the sample. 

[0134] Within related aspects, methods are provided for inhibiting the development of a cancer in a patient, comprising
administering to a patient a biological sample treated as described above. 

[0135] Methods are further provided, within other aspects, for stimulating and/or expanding T cells specific for a 
polypeptide of the present invention, comprising contacting T cells with one or more of: (i) a polypeptide as described 
above; (ii) a polynucleotide encoding such a polypeptide; and/or (iii) an antigen presenting cell that expresses such a
polypeptide; under conditions and for a time sufficient to permit the stimulation and/or expansion of T cells. Isolated T
cell populations comprising T cells prepared as described above are also provided. 

[0136] Within further aspects, the present invention provides methods for inhibiting the development of a cancer in a 
patient, comprising administering to a patient an effective amount of a T cell population as described above. 
The present invention further provides methods for inhibiting the development of a cancer in a patient, comprising the
steps of: (a) incubating CD4+ and/or CD8+ T cells isolated from a patient with one or more of: (i) a polypeptide disclosed
herein; (ii) a polynucleotide encoding such a polypeptide; and (iii) an antigen-presenting cell that expresses such a
polypeptide, such that the T cells proliferate; and (b) administering to the patient an effective amount of the proliferated
T cells, and thereby inhibiting the development of a cancer in the patient. Proliferated cells may, but need not, be cloned
prior to administration to the patient.

[0137] According to another embodiment of this invention, an immunogenic composition described herein is delivered
to a host via antigen presenting cells (APCs), such as dendritic cells, macrophages, B cells, monocytes and other cells 
that may be engineered to be efficient APCs. Such cells may, but need not, be genetically modified to increase the 
capacity for presenting the antigen, to improve activation and/or maintenance of the T cell response, to have anti-tumor 
effects per se and/or to be immunologically compatible with the recipient (i.e., matched HLA haplotype). APCs may
generally be isolated from any of a variety of biological fluids and organs, including tumor and peritumoral tissues, and
may be autologous, allogeneic, syngeneic or xenogeneic cells. 

[0138] Certain preferred embodiments of the present invention use dendritic cells or progenitors thereof as
antigen-presenting cells. Dendritic cells are highly potent APCs (Banchereau and Steinman, Nature 392:245-251, 1998) and
have been shown to be effective as a physiological adjuvant for eliciting prophylactic or therapeutic antitumor immunity
(see Timmerman and Levy, Ann. Rev. Med. 50:507-529, 1999). In general, dendritic cells may be identified based on
their typical shape (stellate in situ, with marked cytoplasmic processes (dendrites) visible in vitro), their ability to take
up, process and present antigens with high efficiency and their ability to activate naive T cell responses. Dendritic cells
may, of course, be engineered to express specific cell-surface receptors or ligands that are not commonly found on
dendritic cells in vivo or ex vivo, and such modified dendritic cells are contemplated by the present invention. As an
alternative to dendritic cells, vesicles secreted by antigen-loaded dendritic cells (called exosomes) may be used within a
vaccine (see Zitvogel et al., Nature Med. 4:594-600, 1998).

[0139] Dendritic cells and progenitors may be obtained from peripheral blood, bone marrow, tumor-infiltrating cells, 
peritumoral tissues-infiltrating cells, lymph nodes, spleen, skin, umbilical cord blood or any other suitable tissue or fluid. 
For example, dendritic cells may be differentiated ex vivo by adding a combination of cytokines such as GM-CSF, IL-4, 
IL-13 and/or TNFα to cultures of monocytes harvested from peripheral blood. Alternatively, CD34 positive cells harvested
from peripheral blood, umbilical cord blood or bone marrow may be differentiated into dendritic cells by adding to the
culture medium combinations of GM-CSF, IL-3, TNFα, CD40 ligand, LPS, flt3 ligand and/or other compound(s) that
induce differentiation, maturation and proliferation of dendritic cells. 

[0140] Dendritic cells are conveniently categorized as "immature" and "mature" cells, which provides a simple way to
discriminate between two well characterized phenotypes. However, this nomenclature should not be construed to exclude
all possible intermediate stages of differentiation. Immature dendritic cells are characterized as APCs with a high capacity
for antigen uptake and processing, which correlates with the high expression of Fcγ receptor and mannose receptor.
The mature phenotype is typically characterized by a lower expression of these markers, but a high expression of cell
surface molecules responsible for T cell activation such as class I and class II MHC, adhesion molecules (e.g., CD54
and CD11) and costimulatory molecules (e.g., CD40, CD80, CD86 and 4-1BB).

[0141] APCs may generally be transfected with a polynucleotide of the invention (or portion or other variant thereof) 
such that the encoded polypeptide, or an immunogenic portion thereof, is expressed on the cell surface. Such transfection 
may take place ex vivo, and a pharmaceutical composition comprising such transfected cells may then be used for
therapeutic purposes, as described herein. Alternatively, a gene delivery vehicle that targets a dendritic or other antigen
presenting cell may be administered to a patient, resulting in transfection that occurs in vivo. In vivo and ex vivo transfection
of dendritic cells, for example, may generally be performed using any methods known in the art, such as those described
in WO 97/24447, or the gene gun approach described by Mahvi et al., Immunology and Cell Biology 75:456-460, 1997.
Antigen loading of dendritic cells may be achieved by incubating dendritic cells or progenitor cells with the tumor
polypeptide, DNA (naked or within a plasmid vector) or RNA; or with antigen-expressing recombinant bacteria or viruses (e.g.,
vaccinia, fowlpox, adenovirus or lentivirus vectors). Prior to loading, the polypeptide may be covalently conjugated to
an immunological partner that provides T cell help (e.g., a carrier molecule). Alternatively, a dendritic cell may be pulsed 
with a non-conjugated immunological partner, separately or in the presence of the polypeptide. 


Definitions 

[0142] Also provided by the invention are methods for the analysis of character sequences or strings, particularly 
genetic sequences or encoded protein sequences. Preferred methods of sequence analysis include, for example,
methods of sequence homology analysis, such as identity and similarity analysis, DNA, RNA and protein structure analysis,
sequence assembly, cladistic analysis, sequence motif analysis, open reading frame determination, nucleic acid base 
calling, codon usage analysis, nucleic acid base trimming, and sequencing chromatogram peak analysis. 
[0143] A computer based method is provided for performing homology identification. This method comprises the steps 
of: providing a first polynucleotide sequence comprising the sequence of a polynucleotide of the invention in a computer
readable medium; and comparing said first polynucleotide sequence to at least one second polynucleotide or polypeptide
sequence to identify homology. A computer based method is also provided for performing homology identification, said
method comprising the steps of: providing a first polypeptide sequence comprising the sequence of a polypeptide of the
invention in a computer readable medium; and comparing said first polypeptide sequence to at least one second
polynucleotide or polypeptide sequence to identify homology.

[0144] "Identity," as known in the art, is a relationship between two or more polypeptide sequences or two or more
polynucleotide sequences, as the case may be, as determined by comparing the sequences. In the art, "identity" also
means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as
determined by the match between strings of such sequences. "Identity" can be readily calculated by known methods,
including but not limited to those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York,
1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey,
1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J.
Applied Math., 48: 1073 (1988). Methods to determine identity are designed to give the largest match between the
sequences tested. Moreover, methods to determine identity are codified in publicly available computer programs.
Computer program methods to determine identity between two sequences include, but are not limited to, the GAP program
in the GCG program package (Devereux, J., et al., Nucleic Acids Research 12(1): 387 (1984)), BLASTP, BLASTN
(Altschul, S.F. et al., J. Molec. Biol. 215: 403-410 (1990)), and FASTA (Pearson and Lipman, Proc. Natl. Acad. Sci. USA
85: 2444-2448 (1988)). The BLAST family of programs is publicly available from NCBI and other sources (BLAST Manual,
Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215: 403-410 (1990)). The well
known Smith Waterman algorithm may also be used to determine identity.
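
By way of illustration only, the Smith Waterman approach mentioned above can be sketched as a short local-alignment scoring routine. The scoring values used here (match +2, mismatch -1, linear gap -2) and the function name are assumptions chosen for the example; they are not parameters given in this specification.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between a and b.

    Illustrative scoring only: the default values are assumptions,
    not parameters taken from the specification.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor of 0.
            current[j] = max(0,
                             previous[j - 1] + pair,  # align a[i-1] with b[j-1]
                             previous[j] + gap,       # gap in b
                             current[j - 1] + gap)    # gap in a
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # The shared core "ACGT" is found regardless of the differing flanks.
    print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Because the score is floored at zero, unrelated flanking sequence does not reduce the score of the best matching region, which is what distinguishes this local method from the global comparison used by the "gap" program.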
[0145] Parameters for polypeptide sequence comparison include the following: 

Algorithm: Needleman and Wunsch, J. Mol Biol. 48: 443-453 (1970)
Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA. 89:10915-10919 (1992)
Gap Penalty: 8
Gap Length Penalty: 2

A program useful with these parameters is publicly available as the "gap" program from Genetics Computer Group,
Madison WI. The aforementioned parameters are the default parameters for peptide comparisons (along with no
penalty for end gaps).

[0146] Parameters for polynucleotide comparison include the following: 

Algorithm: Needleman and Wunsch, J. Mol Biol. 48: 443-453 (1970)
Comparison matrix: matches = +10, mismatch = 0
Gap Penalty: 50
Gap Length Penalty: 3
Available as: The "gap" program from Genetics Computer Group, Madison WI. These are the default parameters
for nucleic acid comparisons.
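
The polynucleotide parameters listed above can be exercised in a minimal sketch of a Needleman and Wunsch style global comparison. The sketch assumes that a gap of k positions costs the Gap Penalty plus k times the Gap Length Penalty, and that end gaps are penalised like internal gaps; the conventions of the "gap" program itself may differ, and the function name is hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(a: str, b: str, match: int = 10, mismatch: int = 0,
                           gap_penalty: int = 50,
                           gap_length_penalty: int = 3) -> float:
    """Best global alignment score with affine gap costs (Gotoh recurrences).

    Assumption: a gap of k positions costs gap_penalty + k * gap_length_penalty.
    """
    n, m = len(a), len(b)
    # M: a[i-1] paired with b[j-1]; X: a[i-1] opposite a gap; Y: b[j-1] opposite a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    first_gap_cost = gap_penalty + gap_length_penalty  # cost of opening a gap
    for i in range(1, n + 1):
        X[i][0] = -(gap_penalty + gap_length_penalty * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_penalty + gap_length_penalty * j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = pair + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - first_gap_cost,
                          Y[i - 1][j] - first_gap_cost,
                          X[i - 1][j] - gap_length_penalty)  # extend existing gap
            Y[i][j] = max(M[i][j - 1] - first_gap_cost,
                          X[i][j - 1] - first_gap_cost,
                          Y[i][j - 1] - gap_length_penalty)  # extend existing gap
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACGT"))            # four matches
    print(global_alignment_score("AAAAAAAAAA", "AAAAAAAAA"))  # nine matches, one gap
```

With these values a single one-base gap costs 53, so it is only introduced when it recovers at least six matches that would otherwise be lost.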

[0147] A preferred meaning for "identity" for polynucleotides and polypeptides, as the case may be, is provided in
(1) and (2) below.

(1) Polynucleotide embodiments further include an isolated polynucleotide comprising a polynucleotide sequence
having at least a 50, 60, 70, 80, 85, 90, 95, 97 or 100% identity to any of the reference sequences of SEQ ID NO:9
to SEQ ID NO:16, wherein said polynucleotide sequence may be identical to any of the reference sequences of
SEQ ID NO:9 to SEQ ID NO:16 or may include up to a certain integer number of nucleotide alterations as compared to
the reference sequence, wherein said alterations are selected from the group consisting of at least one nucleotide
deletion, substitution, including transition and transversion, or insertion, and wherein said alterations may occur at
the 5' or 3' terminal positions of the reference nucleotide sequence or anywhere between those terminal positions,
interspersed either individually among the nucleotides in the reference sequence or in one or more contiguous
groups within the reference sequence, and wherein said number of nucleotide alterations is determined by multiplying
the total number of nucleotides in any of SEQ ID NO:9 to SEQ ID NO:16 by the integer defining the percent identity
divided by 100 and then subtracting that product from said total number of nucleotides in any of SEQ ID NO:9 to
SEQ ID NO:16, or:

n_n ≤ x_n - (x_n • y),

wherein n_n is the number of nucleotide alterations, x_n is the total number of nucleotides in any of SEQ ID NO:9 to
SEQ ID NO:16, y is 0.50 for 50%, 0.60 for 60%, 0.70 for 70%, 0.80 for 80%, 0.85 for 85%, 0.90 for 90%, 0.95 for
95%, 0.97 for 97% or 1.00 for 100%, and • is the symbol for the multiplication operator, and wherein any non-integer
product of x_n and y is rounded down to the nearest integer prior to subtracting it from x_n. Alterations of polynucleotide
sequences encoding the polypeptides of any of SEQ ID NO:1 to SEQ ID NO:8 may create nonsense, missense or
frameshift mutations in this coding sequence and thereby alter the polypeptide encoded by the polynucleotide
following such alterations.

By way of example, a polynucleotide sequence of the present invention may be identical to any of the reference
sequences of SEQ ID NO:9 to SEQ ID NO:16, that is it may be 100% identical, or it may include up to a certain
integer number of nucleic acid alterations as compared to the reference sequence such that the percent identity is
less than 100% identity. Such alterations are selected from the group consisting of at least one nucleic acid deletion,
substitution, including transition and transversion, or insertion, and wherein said alterations may occur at the 5' or
3' terminal positions of the reference polynucleotide sequence or anywhere between those terminal positions,
interspersed either individually among the nucleic acids in the reference sequence or in one or more contiguous
groups within the reference sequence. The number of nucleic acid alterations for a given percent identity is determined
by multiplying the total number of nucleic acids in any of SEQ ID NO:9 to SEQ ID NO:16 by the integer defining the
percent identity divided by 100 and then subtracting that product from said total number of nucleic acids in any of
SEQ ID NO:9 to SEQ ID NO:16, or:

n_n ≤ x_n - (x_n • y),

wherein n_n is the number of nucleic acid alterations, x_n is the total number of nucleic acids in any of SEQ ID NO:9
to SEQ ID NO:16, y is, for instance 0.70 for 70%, 0.80 for 80%, 0.85 for 85% etc., • is the symbol for the multiplication
operator, and wherein any non-integer product of x_n and y is rounded down to the nearest integer prior to subtracting
it from x_n.

(2) Polypeptide embodiments further include an isolated polypeptide comprising a polypeptide having at least a
50, 60, 70, 80, 85, 90, 95, 97 or 100% identity to the polypeptide reference sequence of any of SEQ ID NO:1 to SEQ
ID NO:8, wherein said polypeptide sequence may be identical to any of the reference sequences of SEQ ID NO:1 to
SEQ ID NO:8 or may include up to a certain integer number of amino acid alterations as compared to the reference
sequence, wherein said alterations are selected from the group consisting of at least one amino acid deletion,
substitution, including conservative and non-conservative substitution, or insertion, and wherein said alterations
may occur at the amino- or carboxy-terminal positions of the reference polypeptide sequence or anywhere between
those terminal positions, interspersed either individually among the amino acids in the reference sequence or in
one or more contiguous groups within the reference sequence, and wherein said number of amino acid alterations
is determined by multiplying the total number of amino acids in any of SEQ ID NO:1 to SEQ ID NO:8 by the integer
defining the percent identity divided by 100 and then subtracting that product from said total number of amino acids
in any of SEQ ID NO:1 to SEQ ID NO:8, or:

n_a ≤ x_a - (x_a • y),

wherein n_a is the number of amino acid alterations, x_a is the total number of amino acids in any of SEQ ID NO:1 to
SEQ ID NO:8, y is 0.50 for 50%, 0.60 for 60%, 0.70 for 70%, 0.80 for 80%, 0.85 for 85%, 0.90 for 90%, 0.95 for 95%,
0.97 for 97% or 1.00 for 100%, and • is the symbol for the multiplication operator, and wherein any non-integer product
of x_a and y is rounded down to the nearest integer prior to subtracting it from x_a.

[0148] By way of example, a polypeptide sequence of the present invention may be identical to the reference sequence
of any of SEQ ID NO:1 to SEQ ID NO:8, that is it may be 100% identical, or it may include up to a certain integer number
of amino acid alterations as compared to the reference sequence such that the percent identity is less than 100% identity.
Such alterations are selected from the group consisting of at least one amino acid deletion, substitution, including
conservative and non-conservative substitution, or insertion, and wherein said alterations may occur at the amino- or
carboxy-terminal positions of the reference polypeptide sequence or anywhere between those terminal positions,
interspersed either individually among the amino acids in the reference sequence or in one or more contiguous groups within
the reference sequence. The number of amino acid alterations for a given % identity is determined by multiplying the
total number of amino acids in any of SEQ ID NO:1 to SEQ ID NO:8 by the integer defining the percent identity divided
by 100 and then subtracting that product from said total number of amino acids in any of SEQ ID NO:1 to SEQ ID NO:8, or:

n_a ≤ x_a - (x_a • y),

wherein n_a is the number of amino acid alterations, x_a is the total number of amino acids in any of SEQ ID NO:1 to SEQ
ID NO:8, y is, for instance 0.70 for 70%, 0.80 for 80%, 0.85 for 85% etc., and • is the symbol for the multiplication operator,
and wherein any non-integer product of x_a and y is rounded down to the nearest integer prior to subtracting it from x_a.
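
The calculation of the maximum number of alterations described in (1) and (2) can be illustrated with a short sketch. The function name and the reference length of 503 residues are assumptions made for the example only; the length is not taken from any SEQ ID NO. Integer arithmetic is used so that the "rounded down" step is exact.

```python
def max_alterations(total_residues: int, percent_identity: int) -> int:
    """Upper bound on alterations: x - floor(x * y), with y = percent_identity / 100."""
    if total_residues < 0:
        raise ValueError("total_residues must be non-negative")
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    # Multiply first, then floor-divide by 100: this rounds the product
    # of x and y down without any floating-point error.
    rounded_product = (total_residues * percent_identity) // 100
    return total_residues - rounded_product


if __name__ == "__main__":
    reference_length = 503  # hypothetical length, for illustration only
    for percent in (50, 60, 70, 80, 85, 90, 95, 97, 100):
        allowed = max_alterations(reference_length, percent)
        print(f"{percent:>3}% identity: up to {allowed} alterations")
```

The same routine applies to nucleotides (n_n, x_n) and to amino acids (n_a, x_a), since the two formulas differ only in what is being counted.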

Figure legends 

[0149] 

Figure 1: Sequence information for C-LytA. Each repeat has been defined on the basis of both multiple sequence
alignment and secondary structure prediction using the following alignment programs: 1) MatchBox (Depiereux E
et al. (1992) Comput Applic Biosci 8:501-9); 2) ClustalW (Thompson JD et al. (1994) Nucl Acid Res 22:4673-80);
3) Block-Maker (Henikoff S et al. (1995) Gene 163:gc17-26)
Figure 2: CPC and native Constructs (SEQ ID NOs. 27-36)

Figure 3: Schematic structure of CPC-P501 His fusion protein expressed in S. cerevisiae
Figure 4: Primary structure of CPC-P501 His fusion protein (SEQ ID NO. 41)
Figure 5: Nucleotide sequence of CPC P501 His (pRIT15201) (SEQ ID NO. 42)
Figure 6: Cloning strategy for generation of plasmid pRIT15201
Figure 7: Plasmid map of pRIT15201
Figure 8: Comparative expression of CPC P501 and P501 in S. cerevisiae strain DC5
Figure 9: Production of CPC-P501S HIS (Y1796) at small scale. Fig. 9A represents the antigen productivity as
estimated by SDS-PAGE with silver staining; Fig. 9B represents the antigen productivity as estimated by western blot.
Figure 10: Purification scheme of CPC-P501-His produced by Y1796.
Figure 11: Pattern of CPC P501 His purified protein (4-12% Novex Nu-Page polyacrylamide precast gels).
Figure 12: Native full-length P501S sequence (SEQ ID NO:17)
Figure 13: Sequence of the CPC-P501S expression cassette of JNW735 (SEQ ID NO:18)
Figure 14: Two codon optimised P501S sequences (SEQ ID NO:19-20)
Figure 15: Re-engineered codon optimised sequence 19 (SEQ ID NO:21)
Figure 16: Re-engineered codon optimised sequence 20 (SEQ ID NO:22)
Figure 17: The starting sequence for the optimisation of CPC (SEQ ID NO:23)
Figure 18: Representative codon optimised CPC sequences (SEQ ID NO:24-25)
Figure 19: Engineered CPC codon optimised sequence (SEQ ID NO:26)
Figure 20: P501S CPC fusion candidate constructs and sequences (SEQ ID NOs. 37-40 & 45-48)
Figure 21: Western blot analysis of CHO cells following transient transfection with P501S (JNW680), CPC-P501S
(JNW735) and empty vector control.
Figure 22: Anti-P501S antibody responses following immunisation at day 0, 21 & 42 with pVAC-P501S (JNW680,
mice B1-9) or Empty vector (pVAC, mice A1-6). A pre-bleed was taken at day -1. Subsequently bleeds were taken
at day 28 and day 49 (mice A1-3, B1-3) and day 56 (mice A4-6, B4-9). All sera were tested at 1/100 dilution. The
results for the pVAC immunised mice were averaged. The results for the individual pVAC-P501S immunised mice
are shown. As a positive control, sera from Adeno-P501S immunised mice (Corixa Corp, diluted 1/100) are included.
Figure 23: Peptide library screen using C57BL/6 mice immunised at day 0, 21, 42, and 70 with pVAC-P501S
(JNW680). All peptides were used at a final concentration of 50 µg/ml. Peptides 1-50 are overlapping 15-20mers
obtained from Corixa. Peptides 51-70 are predicted 8-9mer Kb and Db epitopes and were ordered from Mimotopes
(UK). Samples 71-72 and 73-78 are DMSO controls and no peptide controls respectively. Graph A shows the
IFN-γ responses whilst Graph B shows the IL-2 responses. Peptides selected for use in subsequent immunoassays are
shown in black.
Figure 24: Cellular responses by ELISPOT at day 77 following PMID immunisation at day 0, 21, 42, and 70 with
pVAC-P501S (JNW680, B6-9) and pVAC empty (A4-6). Peptides 18, 22 & 48 were used at 50 µg/ml. CPC-P501S
protein was used at 20 µg/ml. Graph A shows the IFN-γ responses whilst Graph B shows the IL-2 responses.
Figure 25: Comparison of P501S and CPC-P501S. Cellular responses were measured by IL-2 ELISPOT using
peptide 22 (10 µg/ml) at day 28. Mice were immunised by PMID at day 0 and 21 with pVAC empty (control),
pVAC-P501S (JNW680) and CPC-P501S (JNW735).
Figure 26: Immune response (lymphoproliferation on spleen cells) following protein immunisation with CPC-P501S.
Figure 27: Evaluation of the immune response to different CPC-P501S constructs. Cellular responses were
measured by IL-2 ELISPOT at day 28. Mice were immunised by PMID at day 0 and 21 with p7313-ie empty (control),
JNW735 and CPC-P501S constructs (JNW770, 771 and 773)
Figure 28: MUC-1 CPC sequences (SEQ ID NOs. 49 & 50)
Figure 29: ss-CPC-MUC-1 sequences (SEQ ID NOs. 51 & 52)

[0150] The invention will be further described by reference to the following examples: 

EXAMPLE I: Preparation of the recombinant Yeast strain Y1796 expressing P501 Fusion Protein containing a
C-LytA-P2-C-LytA (CPC) as fusion partner

1. - Protein design 

[0151] The structure of the fusion protein C-P2-C-P501 (alternatively named CPC-P501) to be expressed in S.
cerevisiae is depicted in figure 3. This fusion contains the C-terminal region of the LytA gene product (residues 187 to 306),
in which the P2 fragment of tetanus toxin (residues 830-843) has been inserted. The P2 fragment is placed between
residues 277 and 278 of C-LytA. The C-LytA fragment containing the P2 insertion is followed by P501 (amino acid
residues 51 to 553) and by the His tail.
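
The segment arithmetic implied by this design can be tallied with a minimal sketch. The class and variable names are hypothetical, and the tally covers only the residue ranges stated in the text: the His tail, the initiating methionine and any junction residues are left out because their lengths are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    source: str
    first: int  # first residue, 1-based and inclusive
    last: int   # last residue, inclusive

    @property
    def length(self) -> int:
        return self.last - self.first + 1


# Order of segments from N-terminus to C-terminus, as described in the text.
# The His tail is omitted because its length is not stated.
CPC_P501_LAYOUT = [
    Segment("C-LytA, part before the P2 insertion", "LytA", 187, 277),
    Segment("P2 T-helper epitope", "tetanus toxin", 830, 843),
    Segment("C-LytA, part after the P2 insertion", "LytA", 278, 306),
    Segment("P501", "P501S", 51, 553),
]


def total_length(segments) -> int:
    """Sum of the residue counts of all listed segments."""
    return sum(segment.length for segment in segments)


if __name__ == "__main__":
    for segment in CPC_P501_LAYOUT:
        print(f"{segment.name:<40} {segment.first}-{segment.last}  {segment.length} aa")
    print("Total of the stated ranges:", total_length(CPC_P501_LAYOUT), "aa")
```

On these ranges the two C-LytA parts together contribute 120 residues, the P2 insert 14, and the P501 portion 503.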

[0152] The primary structure of the resulting fusion protein has the sequence described in figure 4 and the coding 
sequence corresponding to the above protein design is in figure 5. 

2. - Cloning strategy for the generation of a yeast plasmid expressing CPC-P501 (51-553)-His fusion protein

[0153] 

• The starting material is the yeast vector pRIT15068 (UK patent application 0015619.0). 

This vector contains the yeast CUP1 promoter, the yeast alpha prepro signal coding sequence and the coding
sequence corresponding to residues 55 to 553 of P501S followed by the His tail.

The cloning strategy outlined in figure 6 includes the following steps:

a) The first step is the insertion of the P2 sequence (codon-optimised for yeast expression) in frame, inside the
C-lytA coding sequence. The C-lytA coding sequence is harbored by plasmid pRIT14662 (PCT/EP99/00660).
The insertion is done using an adaptor formed by two complementary oligonucleotides named P21 and P22,
ligated into plasmid pRIT14662 previously opened with NcoI.
The sequences of P21 and P22 are:
P21 5' catgcaatacatcaaggctaactctaagttcattggtatcactgaaggcgt 3'
P22 3' gttatgtagttccgattgagattcaagtaaccatagtgacttccgcagtac 5'



After ligation, transformation of E. coli and transformant characterization, the plasmid named pRIT15199 is
obtained. 

b) The second step is the preparation of the C-lytA-P2-C-lytA DNA fragment by PCR amplification. The amplification
is performed using pRIT15199 as template and the oligonucleotides named C-LytANOTATG and C-LytA-aa55.
The sequences of the two oligonucleotides are:

C-LytANOTATG = 5' aaaaccatggcggccgcttacgtacattccgacggctcttatccaaaagacaag 3'
C-LytA-aa55 = 5' aaacatgtacatgaacttttctggcctgtctgccagtgttc 3'

The amplified fragment is treated with the restriction enzymes NcoI and AflIII to generate the respective cohesive
ends.

c) The next step is the ligation of the above fragment with vector pRIT15068 (largest fragment obtained after
NcoI treatment) to generate the complete fusion protein coding sequence. After ligation and E. coli transformation
the plasmid named pRIT15200 is obtained. In this plasmid the remaining unique NcoI site contains the ATG
coding for the start codon.

d) In the next step an NcoI fragment containing the CUP1 promoter and a portion of 2µ plasmid sequences is
prepared from plasmid pRIT15202. Plasmid pRIT15202 is a yeast 2µ derivative containing the CUP1 promoter
with an NcoI site at the ATG (ATG sequence: AAACC ATG)

e) The NcoI fragment isolated from pRIT15202 is ligated to pRIT15200, previously opened with NcoI, in the right
orientation, such that the pCUP1 promoter is at the 5' side of the coding sequence. This results in the
generation of a final expression plasmid named pRIT15201 (see figure 7).
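
The oligonucleotides used in steps a) and b) can be sanity-checked with a short sketch. It tests whether P21 and P22 anneal to leave 5' overhangs compatible with an NcoI-opened plasmid, translates the codons carried by the adaptor, and looks for the NcoI (ccatgg) and AflIII (acrygt) recognition sequences in the PCR primers. The helper names are hypothetical, and one ambiguous base in P21 (position 34) is read as "t" on the assumption that it pairs with the corresponding base of P22.

```python
import re

COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}

# P21 is written 5'->3'; P22 is written 3'->5', as in step a).
# Assumption: position 34 of P21 is "t", the base that pairs with P22.
P21 = "catgcaatacatcaaggctaactctaagttcattggtatcactgaaggcgt"
P22_3_TO_5 = "gttatgtagttccgattgagattcaagtaaccatagtgacttccgcagtac"
C_LYTA_NOTATG = "aaaaccatggcggccgcttacgtacattccgacggctcttatccaaaagacaag"
C_LYTA_AA55 = "aaacatgtacatgaacttttctggcctgtctgccagtgttc"

OVERHANG = 4  # NcoI cuts c^catgg, leaving a four-base 5' overhang

# Standard genetic code, restricted to the codons present in the adaptor insert.
CODONS = {"caa": "Q", "tac": "Y", "atc": "I", "aag": "K", "gct": "A", "aac": "N",
          "tct": "S", "ttc": "F", "att": "I", "ggt": "G", "act": "T", "gaa": "E"}


def duplex_is_complementary(top: str, bottom_3_to_5: str, overhang: int) -> bool:
    """True if the strands pair base by base outside their 5' overhangs."""
    top_paired = top[overhang:]
    bottom_paired = bottom_3_to_5[:len(bottom_3_to_5) - overhang]
    return (len(top_paired) == len(bottom_paired)
            and all(COMPLEMENT[t] == b for t, b in zip(top_paired, bottom_paired)))


def five_prime_overhangs(top: str, bottom_3_to_5: str, overhang: int):
    """Return both single-stranded ends, each read 5'->3'."""
    return top[:overhang], bottom_3_to_5[-overhang:][::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first base; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print("Strands complementary:", duplex_is_complementary(P21, P22_3_TO_5, OVERHANG))
    print("5' overhangs:", five_prime_overhangs(P21, P22_3_TO_5, OVERHANG))
    print("Adaptor codons encode:", translate(P21[OVERHANG:OVERHANG + 42]))
    print("NcoI site in C-LytANOTATG:", bool(re.search("ccatgg", C_LYTA_NOTATG)))
    print("AflIII site in C-LytA-aa55:", bool(re.search("ac[ag][ct]gt", C_LYTA_AA55)))
```

Under these assumptions the 14 codons that follow the overhang encode a 14-residue peptide, which matches the length of the P2 fragment (residues 830-843) described in the protein design.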

3. - Preparation of the recombinant yeast strain Y1796 (RIX4440) 

[0154] The plasmid pRIT15201 is used to transform the S. cerevisiae strain DC5 (ATCC 20820). After selection and
characterisation of the yeast transformants containing the plasmid pRIT15201, a recombinant yeast strain named Y1796
expressing CPC-P501-His fusion protein is obtained. After reduction and carboxyamidation, the protein is isolated and
purified by affinity chromatography (IMAC) followed by anion exchange chromatography (Q Sepharose FF).

Example II 

[0155] In analogous fashion, protein constructs as depicted in figure 2 may be expressed utilising the corresponding
DNA sequences shown therein. In particular, yeast strain SC333 (construct 2) corresponds to the Y1796 strain but expresses
P501 (55-553) devoid of the CPC fusion partner. Yeast strain Y1800 (construct 3) corresponds to the Y1796 strain but additionally
comprises the native signal sequence for P501S (aa1-aa34), while yeast strain Y1802 (construct 4) comprises the alpha
pre signal sequence upstream of the CPC-P501S sequence. Yeast strain Y1790 (construct 5) expresses a P501S construct
devoid of CPC and having the alpha prepro signal sequence.

Example III. Preparation of purified CPC-P501 

1. - Production of CPC-P501S HIS (Y1796) at small scale 

[0156] For Y1796, in minimal medium supplemented with histidine, expression is induced in log phase by addition of
CuSO4 at a concentration ranging from 100 to 500 µM, and the culture is maintained at 30°C. Cells are harvested after
8 or 24 h of induction. Copper is added just before use and not mixed with medium in advance.

[0157] For SDS-PAGE analysis, yeast cell extraction is performed in citrate phosphate buffer pH 4.0 + 130 mM NaCl.
Extraction is performed with glass beads for small cell quantities and with a French press for larger cell quantities; the
extract is then mixed with sample buffer and analysed by SDS-PAGE. Results of the comparative SDS-PAGE analysis of
the different constructs are depicted in figure 8 and summarised in Table 2 below.

As shown in Table 2 below, the level of expression of the culture is much higher for the Y1796 strain than for
the parent strain SC333, a strain expressing the corresponding P501S-His devoid of CPC partner.
Likewise, the presence of a signal sequence (alpha pre) does not affect the results discussed above: the level of
expression of the culture is much higher for the Y1802 strain than for the corresponding strain
Y1790, a strain expressing the corresponding P501S-His devoid of CPC partner.



Table 2

Recombinant Strain | Plasmid   | Promoter | Signal sequence | Fusion Partner | P501 aa sequences | Expression level
SC333              | Ma333     | CUP1     |                 |                | 55-553-His        | ND
Y1796              | pRIT15201 | CUP1     |                 | CPC            | 51-553-His        |
Y1802              | pRIT15219 | CUP1     | α pre           | CPC            | 51-553-His        |
Y1790              | pRIT15068 | CUP1     | α prepro        |                | 55-553-His        |

CPC = C-LytA P2 C-LytA
ND = not detectable, even in western blot
+ = detectable in western blot
+++ / ++++ = detectable in western blot and visible in silver stained gels

2. - Fermentation of Y1796 (RIX4440) at larger scale

[0158] 100 µl of the working seed are spread on solid medium and grown for approximately 24h at 30°C. This solid
pre-culture is then used to inoculate a liquid pre-culture in shake flasks.

[0159] This liquid pre-culture is grown for 20h at 30°C and transferred into a 20L fermenter. The fed-batch fermentation
includes a growth phase of about 44h and an induction phase of about 22h.

[0160] The carbon source (glucose) was supplemented to the culture by continuous feeding. The residual glucose
concentration was maintained very low (<50mg/L) in order to minimise ethanol production by fermentation. This was
achieved by restricting the growth of the micro-organism through a limited glucose feed rate.
At the end of the growth phase, the CUP1 promoter is induced by adding CuSO4 in order to produce the antigen.

[0161] The absence of contamination was checked by inoculating 10^6 cells into standard TSB and THI vials
supplemented with nystatin and incubated for 14 days at 20-25°C and at 30-35°C respectively. No growth was observed,
as expected.

3. - Antigen characterisation and productivity 

[0162] Cell homogenates were prepared by French pressing of fermentation samples harvested at different times
during the induction phase and analysed by SDS-PAGE and Western Blot. It was shown that the major part of the protein
of interest was located in the insoluble fraction obtained from the cell homogenate after centrifugation. The SDS-PAGE
and Western Blot analyses shown in the Figures below were performed on the pellets obtained after centrifugation of these
cell homogenates.

[0163] Figures 8 A and B show the kinetics of antigen production during the induction phase for culture PR0127. It
appears that no antigen expression occurred during the growth phase. The specific antigen productivity appears to increase
from the beginning of the induction phase up to 6h and then remains quite stable up to the end. However, the volumetric
productivity increased by a factor of 1.5 to 2 owing to the biomass accumulation observed over the same period. The
antigen productivity was estimated at about 500 mg per litre of fermentation broth by comparing a purified reference of
the antigen with crude extracts on SDS-PAGE with silver staining (figure 9A) and by WB analyses using an anti-P501S
antibody (a murine ascites directed against P501S aa439-aa459, used at a dilution of 1/1000) (figure 9B).

Example IV. Purification of CPC-P501 (51-553)-His fusion protein produced by Y1796

[0164] After cell breakage, the protein is associated with the pellet fraction. A carbamido-methylation of the molecule
has been introduced in the process in order to cope with the oxidative aggregation of the molecule with itself or with
host cell protein contaminants through covalent bridging by disulphide bonds. The use of detergents has also been
required to manage the hydrophobic character of this protein (12 trans-membrane domains predicted).
[0165] The purification protocol, developed for the scale of 1 L of culture at OD (optical density) 120, is described in
figure 10. All the operations are performed at room temperature (RT).

According to the DOC TCA BCA protein assay, the global purification yield is 30 - 70 mg of purified antigen / L of culture
OD 120. The yield is linked to the level of expression of the culture and is higher than the purification yield
of the parent strain expressing unfused P501S-His.

The protein assay is performed as follows: proteins are first precipitated using TCA (trichloroacetic acid) in the presence
of DOC (deoxycholate), then dissolved in an alkaline medium in the presence of SDS. The proteins then react with BCA
(bicinchoninic acid) (Pierce) to form a soluble purple complex presenting a high absorbance at 562 nm, which is
proportional to the amount of protein present in the sample.
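
Because the absorbance at 562 nm is proportional to the amount of protein, a reading is usually converted to a concentration through a standard curve. The following is a minimal Python sketch of that conversion, assuming a linear response over the working range. The calibration standards, absorbance values and function names are invented for illustration and are not taken from this document.

```python
def fit_line(xs, ys):
    """Least-squares straight line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_a562(a562, slope, intercept):
    """Invert the standard curve A562 = slope * concentration + intercept."""
    return (a562 - intercept) / slope


# Hypothetical calibration standards (µg/ml) and their A562 readings.
STANDARDS_UG_PER_ML = [0.0, 250.0, 500.0, 1000.0]
STANDARD_A562 = [0.05, 0.30, 0.55, 1.05]

SLOPE, INTERCEPT = fit_line(STANDARDS_UG_PER_ML, STANDARD_A562)

# A hypothetical sample reading, converted to µg/ml with the fitted curve.
sample_ug_per_ml = concentration_from_a562(0.80, SLOPE, INTERCEPT)
```

In practice the standards would be carried through the same DOC/TCA precipitation as the samples, so that precipitation losses affect both equally.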

SDS-PAGE analysis of 3 purified bulks (figure 11) shows no difference between reducing and non-reducing conditions (cf.
lanes 2, 3 and 4 versus lanes 5, 6 and 7). The pattern consists of a major band at 70 kDa, a smear of higher MW and
faint degradation bands. All the bands are detected by a specific anti-P501S monoclonal antibody.

Example V. Vaccine preparation using CPC-P501S His protein

[0166] The protein of Example 3 or 4 can be formulated into a vaccine containing QS21 and 3D-MPL in an oil in water 
emulsion. 

1. - Vaccine preparation:

[0167] The antigen, produced as shown in Examples 1 to 3, is C-LytA-P2-P501S His. As an adjuvant, the formulation
comprises a mixture of 3 de-O-acylated monophosphoryl lipid A (3D-MPL) and QS21 in an oil/water emulsion. The
adjuvant system SBAS2 has been previously described in WO 95/17210.
[0168] 3D-MPL: is an immunostimulant derived from the lipopolysaccharide (LPS) of the Gram-negative bacterium
Salmonella minnesota. MPL has been deacylated and is lacking a phosphate group on the lipid A moiety. This chemical
treatment dramatically reduces toxicity while preserving the immunostimulant properties (Ribi, 1986). Ribi
Immunochemistry produces and supplies MPL to SB-Biologicals.

Experiments performed at SmithKline Beecham Biologicals have shown that 3D-MPL combined with various vehicles
strongly enhances both the humoral and a TH1 type of cellular immunity.

[0169] QS21: is a natural saponin molecule extracted from the bark of the South American tree Quillaja saponaria
Molina. A purification technique developed to separate the individual saponins from the crude extracts of the bark
permitted the isolation of the particular saponin, QS21, which is a triterpene glycoside demonstrating stronger adjuvant
activity and lower toxicity as compared with the parent component. QS21 has been shown to activate MHC class I
restricted CTLs to several subunit Ags, as well as to stimulate Ag specific lymphocytic proliferation (Kensil, 1992). Aquila
(formerly Cambridge Biotech Corporation) produces and supplies QS21 to SB-Biologicals.

Experiments performed at SmithKline Beecham Biologicals have demonstrated a clear synergistic effect of combinations
of MPL and QS21 in the induction of both humoral and TH1 type cellular immune responses.

[0170] The oil/water emulsion is composed of an organic phase made of 2 oils (a tocopherol and squalene), and
an aqueous phase of PBS containing Tween 80 as emulsifier. The emulsion comprised 5% squalene, 5% tocopherol and
0.4% Tween 80, had an average particle size of 180 nm and is known as SB62 (see WO 95/17210).
Experiments performed at SmithKline Beecham Biologicals have proven that the addition of this O/W emulsion to
3D-MPL/QS21 (SBAS2) further increases the immunostimulant properties of the latter against various subunit antigens.

2. - Preparation of emulsion SB62 (2 fold concentrate):

[0171] Tween 80 is dissolved in phosphate buffered saline (PBS) to give a 2% solution in the PBS. To provide 100
ml of two fold concentrate emulsion, 5g of DL alpha tocopherol and 5ml of squalene are vortexed to mix thoroughly. 90ml
of PBS/Tween solution is added and mixed thoroughly. The resulting emulsion is then passed through a syringe and
finally microfluidised using an M110S microfluidics machine. The resulting oil droplets have a size of approximately
180 nm.

3. - Formulations: 

[0172] A typical formulation containing 3D-MPL and QS21 in an oil/water emulsion is prepared as follows: 20µg -
25µg C-LytA P2-P501S are diluted in 10 fold concentrated PBS pH 6.8 and H2O before consecutive addition of SB62
(50µl), MPL (20µg), QS21 (20µg), optionally CpG oligonucleotide (100µg), and 1µg/ml thiomersal as
preservative. The amount of each component may vary as necessary. All incubations are carried out at room temperature
with agitation.

Example VI. Codon-optimised P501S sequences

1. - Generation of the control recombinant plasmids: 

[0173] Full-length P501S sequence was cloned into pVAC (Thomsen, Immunology, 1998; 95:510P105), generating
expression plasmid JNW680. SEQ ID NO:17 represents the human P501S expression cassette in the plasmid JNW680
and is illustrated in Figure 12. The protein sequence of SEQ ID NO:17 is shown in single letter format, the start and stop
codons being shown in bold. The Kozak sequence is denoted by the hash symbols. The codon usage index of the human
P501S sequence (SEQ ID NO: 17) is 0.618, as calculated by the SynGene programme.


SynGene programme 

[0174] Basically, the codons are assigned using a statistical method to give a synthetic gene having a codon frequency
closer to that found naturally in highly expressed E. coli and human genes.

[0175] SynGene is an updated version of the Visual Basic program called Calcgene, written by R. S. Hale and G
Thompson (Protein Expression and Purification Vol. 12 pp.185-188 (1998)). For each amino acid residue in the original
sequence, a codon was assigned based on the probability of it appearing in highly expressed E. coli genes. Details of
the Calcgene program, which works under Microsoft Windows 3.1, can be obtained from the authors. Because the
program applies a statistical method to assign codons to the synthetic gene, not all resulting codons are the most
frequently used in the target organism. Rather, the proportion of frequently and infrequently used codons of the target
organism is reflected in the synthetic sequence by assigning codons in the correct proportions. However, as there is no
hard-and-fast rule assigning a particular codon to a particular position in the sequence, each time it is run the program
will produce a different synthetic gene - although each will have the same codon usage pattern and each will encode
the same amino acid sequence. If the program is run several times for a given amino acid sequence and a given target
organism, several different nucleotide sequences will be produced which may differ in the number, type and position of
restriction sites, intron splice signals etc., some of which may be undesirable. The skilled artisan will be able to select
an appropriate sequence for use in expression of the polypeptide on the basis of these features.
[0176] Furthermore, since the codons are randomly assigned on a statistical basis, it is possible (although perhaps
unlikely) that two or more codons which are relatively rarely used in the target organism might be clustered in close
proximity. It is believed that such clusters may upset the machinery of translation and result in particularly low expression
rates, so the algorithm for choosing the codons in the optimized gene excludes any codons with an RSCU value of less
than 0.2 for highly expressed genes in order to prevent any rare codon clusters being fortuitously selected. The distribution
of the remaining codons is then allocated according to the frequencies for highly expressed E. coli to give an overall
distribution within the synthetic gene that is typical of such genes (coefficient = 0.85) and also for highly expressed human
genes (coefficient = 0.50).

Syngene (Peter Ertl, unpublished), an updated version of the Calcgene program, allows exclusion of rare codons to be
optional, and is also used to allocate codons according to the codon frequency pattern of highly expressed human genes.
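
The statistical assignment described above can be sketched as follows. This is not the Calcgene or Syngene code, which is not reproduced in this document; it is a minimal Python illustration, assuming a table of RSCU-like weights per amino acid. The weights in `RSCU` are placeholder values covering only a few residues, and the names `allowed_codons` and `assign_codons` are hypothetical.

```python
import random

# Placeholder RSCU-like weights for a handful of residues (illustrative only,
# not a measured codon usage table).
RSCU = {
    "M": {"ATG": 1.0},
    "L": {"CTG": 3.0, "CTC": 1.2, "TTG": 0.6, "CTT": 0.6, "TTA": 0.5, "CTA": 0.1},
    "K": {"AAG": 1.6, "AAA": 0.4},
    "F": {"TTC": 1.5, "TTT": 0.5},
    "*": {"TGA": 1.5, "TAA": 1.0, "TAG": 0.5},
}


def allowed_codons(amino_acid, table, threshold=0.2):
    """Drop codons whose weight is below the threshold (rare-codon exclusion)."""
    kept = {c: w for c, w in table[amino_acid].items() if w >= threshold}
    if not kept:
        raise ValueError(f"no codon left for {amino_acid!r} at threshold {threshold}")
    return kept


def assign_codons(protein, table, threshold=0.2, rng=None):
    """Pick one codon per residue at random, in proportion to its weight."""
    rng = rng or random.Random()
    gene = []
    for amino_acid in protein:
        kept = allowed_codons(amino_acid, table, threshold)
        codons = list(kept)
        weights = [kept[c] for c in codons]
        gene.append(rng.choices(codons, weights=weights, k=1)[0])
    return "".join(gene)


if __name__ == "__main__":
    # Each run with a different seed may give a different gene for the same protein.
    for seed in (1, 2, 3):
        print(assign_codons("MLKF*", RSCU, rng=random.Random(seed)))
```

Setting `threshold=0.0` corresponds to making the exclusion of rare codons optional, as described for Syngene.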
[0177] The sequence of the CPC-P501S cassette cloned from the vector pRIT15201 (see Figure 7) into pVAC, thereby
generating plasmid JNW735, is set forth in SEQ ID NO: 18 and is illustrated in Figure 13. This sequence is identical to
the pRIT15201 sequence with the exception of the removal of the His tag and the addition of a Kozak sequence (GCCACC)
and appropriate restriction enzyme sites. The amino acid sequence of SEQ ID NO:18 is shown in single letter format,
the start and stop codons are shown in bold. The boxed residues are the P2 helper epitope of tetanus toxoid. The
underlined residues are the Clyta purification tag. The Kozak sequence is denoted by the hash symbols.

2. - Generation of the recombinant plasmids with P501S codon optimised sequences:

[0178] Although the codon coefficient index (CI) of the P501S native sequence is already high (0.618), it is possible
to increase the CI value further. This has two potential benefits: to improve the antigen expression and/or
immunogenicity, and to reduce the possibility of recombination between the P501S vector and genomic sequences.
[0179] Using the Syngene programme, a selection (SEQ ID NO:19 to SEQ ID NO:20) of codon optimised sequences
was obtained (Figure 14). Table 3 below shows a comparison of the codon coefficient index for the starting P501S
sequence and the two representative codon optimised sequences, selected on the basis of a suitable restriction enzyme
site profile and a good CI index.

Table 3 - Comparison of the codon coefficient indices of two codon optimised P501S genes

Sequence     | Codon coefficient index (CI)
-------------|-----------------------------
P501S        | 0.618
SEQ ID NO:19 | 0.725
SEQ ID NO:20 | 0.755



3. Further evaluation of the codon-optimised sequences 
Sequence SEQ ID NO:19 

[0180] Although SEQ ID NO: 19 has a good CI index (0.725), it contains a doublet of rare codons at amino acid
positions 202 and 203. These codons were manually substituted with more frequent codons by changing the DNA sequence
from TTGTTG to CTGCTG. To facilitate cloning and expression, restriction enzyme sites and a Kozak sequence were
added. The final engineered sequence (SEQ ID NO:21) is shown in Figure 15. The Syngene programme was used to
fragment this sequence into oligonucleotides with a minimum overlap of 19-20 bases. Figure 15 therefore shows the
re-engineered P501S codon optimised SEQ ID NO: 19. Restriction enzyme sites are underlined, the Kozak sequence is
in bold, and the DNA sequence re-engineered to remove the rare codon doublet is boxed.
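
A manual substitution of this kind can be checked programmatically. The Python sketch below replaces the codon at a given amino acid position and refuses any change that would alter the encoded residue. It uses the standard genetic code; the toy sequence and the function names are illustrative assumptions and do not come from SEQ ID NO: 19.

```python
# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute_synonymous(dna, aa_position, new_codon):
    """Replace the codon at a 1-based amino acid position with a synonymous one."""
    start = 3 * (aa_position - 1)
    old_codon = dna[start:start + 3]
    if GENETIC_CODE[old_codon] != GENETIC_CODE[new_codon]:
        raise ValueError(f"{old_codon} -> {new_codon} would change the amino acid")
    return dna[:start] + new_codon + dna[start + 3:]


# Toy example: a TTGTTG (Leu-Leu) doublet at amino acid positions 3 and 4.
toy = "ATGGCCTTGTTGAAATGA"
fixed = substitute_synonymous(toy, 3, "CTG")
fixed = substitute_synonymous(fixed, 4, "CTG")
```

Comparing `translate(toy)` with `translate(fixed)` confirms that only the DNA, and not the protein, has changed.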

[0181] Using a two-step PCR protocol, the overlapping primers generated by the Syngene programme were first
assembled using a PCR Assembly protocol (detailed below). The assembly reaction generates a diverse population of
fragments. The correct full-length fragment was recovered/amplified using the PCR recovery protocol and the terminal
primers. The resulting PCR fragment was excised from an agarose gel, purified, restricted with NheI and XhoI and cloned
into pVAC. Positive clones were identified by restriction enzyme analysis and confirmed by double-stranded sequencing.
This generated plasmid JNW766, which, owing to the error-prone nature of the PCR process, contained a single silent
mutation (C to T at position 360 of SEQ ID NO: 21).

1. Assembly reaction - PCR conditions (generic protocol)

[0182] Reaction mix (total volume = 50µl):

- 1 x Reaction buffer (Pfx or Proofstart)
- 1µl Oligo pool (equal mix of all overlapping oligos)
- 0.5mM dNTPs
- DNA polymerase (Pfx or Proofstart, 2.5-5U)
- +/-1mM MgSO4
- +/-1 x enhancer solution (Pfx enhancer or Proofstart buffer Q)

1. 94°C for 120s (Proofstart only)
2. 94°C for 30s
3. 40°C for 120s
4. 72°C for 10s
5. 94°C for 15s
6. 40°C for 30s
7. 72°C for 20s + 3s/cycle
8. Cycle to step 5, 25 times
9. Hold at 4°C



2. Recovery reaction - PCR conditions (generic protocol)

[0183] Reaction mix (total volume = 50µl):

- 1 x Reaction buffer (Pfx or Proofstart)
- 5-10µl assembly reaction mix
- 0.3-0.75mM dNTPs
- 50pmol primer (5' terminal primer, sense orientation)
- 50pmol primer (3' terminal primer, anti-sense orientation)
- DNA polymerase (Pfx or Proofstart, 2.5-5U)
- +/-1mM MgSO4
- +/-1 x enhancer solution (Pfx enhancer or Proofstart buffer Q)

1. 94°C 120s (Proofstart only)
2. 94°C 45s
3. 60°C 30s
4. 72°C 120s
5. Cycle to step 2, 25 times
6. 72°C 240s
7. Hold at 4°C

Sequence SEQ ID NO:20 

[0184] Although SEQ ID NO: 20 has a very good CI index (0.755), it was noticed that it contained a doublet of rare
codons at amino acid positions 131 and 132. These codons were manually substituted with more frequent codons by
changing the DNA sequence from TTGTTG to CTGCTG. To facilitate cloning, an internal BamHI site was removed by
mutating G to C (see the double-underlined nucleotide in Figure 16). To facilitate cloning and expression, restriction
enzyme sites and a Kozak sequence were added. The final engineered sequence (SEQ ID NO:22) is shown in Figure
16. The Syngene programme was used to fragment this sequence into oligonucleotides with a minimum overlap of 19-20
bases.

Figure 16 therefore shows the re-engineered P501S codon optimised sequence 20 (SEQ ID NO:22). Restriction enzyme
sites are underlined, the Kozak sequence is in bold, the DNA sequence re-engineered to remove the rare codon doublet
is boxed, and the silent point mutation that removes the BamHI site is double-underlined.
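
Internal sites such as the BamHI site mentioned above can be located with a simple scan of the candidate sequence. The Python sketch below reports the 1-based start positions of four recognition sequences named in this document (NheI, XhoI, BamHI and NotI). The recognition sequences are the standard ones for these enzymes; the toy sequence and the function names are illustrative assumptions. All four sites are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences of the cloning enzymes named in the text.
SITES = {
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
}


def find_sites(seq, site):
    """Return the 1-based start positions of every occurrence of a site."""
    seq = seq.upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start + 1)
        start = seq.find(site, start + 1)  # allow overlapping matches
    return hits


def site_profile(seq, sites=SITES):
    """Map each enzyme name to the positions where its site occurs."""
    return {name: find_sites(seq, site) for name, site in sites.items()}


# Toy cassette: NheI site, Kozak sequence, short ORF with an internal BamHI site, XhoI site.
toy = "GCTAGCGCCACCATGAAGGATCCTTTCTCGAG"
profile = site_profile(toy)
```

A profile with hits only at the intended flanking positions indicates that the chosen cloning enzymes will not cut inside the coding sequence.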

[0185] Using a similar two-step PCR protocol to the one described above, the full-length P501S fragment was amplified
and cloned into pVAC. Positive clones were identified by restriction enzyme analysis and confirmed by double-stranded
sequencing. This generated plasmid JNW764. The sequence of the P501S coding cassette is shown in Figure 16 (SEQ
ID NO: 22).

DNA Sequence similarity 

[0186] Pair distances following alignment by the ClustalV (weighted) method are shown in Table 4 below, which gives
the percent similarity between the starting human P501S sequence and the two codon optimised sequences
SEQ ID NO:21 and 22 selected for further investigation. The data confirm that the codon optimised DNA sequences
are approximately 80% similar to the original P501S sequence.

Table 4

SEQ ID NO: | % similarity with starting P501S sequence
-----------|------------------------------------------
21         | 79.6
22         | 79.4

Example VII. Codon-optimised CPC sequences

1. - Approach

[0187] Since the CPC sequence was originally designed for optimal expression in yeast, this section describes
the process of codon optimising it for human expression.

2. - Sequence design

[0188] The starting sequence for the optimisation of CPC is shown in Figure 17 (SEQ ID NO: 23). This is derived
entirely from pRIT15201 and contains the entire coding sequence of CPC plus four amino acids of P501S to facilitate
downstream cloning. Using the Syngene programme, a selection of codon optimised sequences was obtained, from
which representative sequences are shown in Figure 18 (SEQ ID NO: 24-25). Table 5 below shows a comparison of
the codon coefficient index for the starting CPC sequence and the two representative codon optimised sequences.

Table 5. Codon coefficient indices for two CPC optimised sequences

Sequence                    | Codon coefficient index (CI)
----------------------------|-----------------------------
Original CPC = SEQ ID NO:23 | 0.506
SEQ ID NO:24                | 0.809
SEQ ID NO:25                | 0.800



[0189] In addition to the codon optimisation, all sequences were also screened for restriction enzyme cloning sites.
On the basis of the highest CI value and a favourable restriction enzyme site profile, SEQ ID NO: 24 was selected for
construction. To facilitate cloning and expression, 5' and 3' cloning sites were added and a Kozak sequence (GCCACC)
was inserted 5' of the initiating ATG start codon. This engineered sequence is shown in Figure 19 (SEQ ID NO:26). This
sequence includes four amino acids of P501S (boxed), restriction enzyme cloning sites (NheI and XhoI, underlined), a
Kozak sequence (bold), a stop codon (italicised) and 4bp of flanking irrelevant DNA to facilitate cloning.
[0190] The Syngene programme was used to fragment this sequence into 50-60-mer oligonucleotides with a minimum 
overlap of 18-20 bases. 
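
The fragmentation step can be illustrated with the Python sketch below. It is not the Syngene routine; it assumes a fixed oligonucleotide length and a fixed overlap, and it alternates sense and anti-sense oligonucleotides, which is a common arrangement for assembly PCR but is an assumption here. The function names are hypothetical, and the final oligonucleotide may be shorter than the others.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def fragment(seq, oligo_len=60, overlap=20):
    """Split seq into overlapping oligos, alternating sense and anti-sense."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than the oligo length")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        piece = seq[start:start + oligo_len]
        # Even-numbered oligos are top strand, odd-numbered ones bottom strand.
        oligos.append(piece if len(oligos) % 2 == 0 else revcomp(piece))
        if start + oligo_len >= len(seq):
            break
        start += step
    return oligos


# Toy 160-base template, fragmented into 60-mers overlapping by 20 bases.
template = "ACGT" * 40
oligos = fragment(template, oligo_len=60, overlap=20)
```

A production tool would also balance the oligonucleotide lengths and check the melting temperature of each overlap, which this sketch does not attempt.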

Using a similar two-step PCR protocol to the one described above, the correct fragment was recovered/amplified and
cloned into pVAC. Positive clones were identified by restriction enzyme analysis and sequence verified, generating vector
JNW759.

4.- DNA similarity 

[0191] Pair distances following alignment by the ClustalV (weighted) method are shown in Table 6 below. The table shows
percent similarity at the DNA level between the starting sequence of CPC and the codon optimised sequences and confirms
that the codon optimised sequences are approximately 80% similar to the original CPC sequence.



Table 6

Sequence SEQ ID NO: | % similarity with starting CPC sequence
--------------------|----------------------------------------
24                  | 80.2
25                  | 81.6
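
A percent similarity figure of this kind can be approximated by counting identical positions between two aligned DNA sequences. The Python sketch below is a simple identity calculation that skips gapped columns; it is not the ClustalV distance calculation, whose exact gap treatment may differ, and the toy sequences and function name are illustrative only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical bases over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # assumption: columns containing a gap are ignored
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example: two 9-base sequences encoding Met-Leu-Lys with different codons.
identity = percent_identity("ATGCTGAAA", "ATGTTGAAG")
```

Because a codon optimised gene encodes the same protein as its parent, the two DNA sequences have equal length and normally align without gaps, so the differences are confined to synonymous codon positions.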



Example VIII. Construction of the P501S fusion candidate 

[0192] All the candidates shown in the schematic below are codon optimised and constructed using overlapping PCR
methodologies from plasmids JNW764 and JNW759 as templates (SEQ ID NO: 22 and SEQ ID NO: 26 respectively),
and cloned into the expression vector p7313-ie.

[0193] The four candidates shown schematically below are based upon CPC-P501S. Codon optimised CPC-P501S
is construct A. Candidates B, C and D also include the sequence encoding the N terminal 50 amino acids of P501S, positioned
either at the N terminus of CPC-P501S (construct D), the C terminus of CPC-P501S (construct C), or between CPC and
P501S (construct B). A schematic representation of the constructs is given in Figure 20.

The nucleotide and protein sequences for each of the four constructs are shown in SEQ ID NO: 37-40 for the nucleotide
sequences, and SEQ ID NO: 45-48 for the corresponding polypeptide sequences. In constructs A, C and D, the underlined
codon preferentially encodes tyrosine (either TAC or TAT), but the nucleotide sequence may be altered to encode
threonine (either ACA, ACC, ACG or ACT). In construct B, the underlined codon preferentially encodes threonine (either
ACA, ACC, ACG or ACT), but the nucleotide sequence may be altered to encode tyrosine (either TAC or TAT). In all
constructs, the coding sequence is flanked by appropriate restriction enzyme cloning sites (in this case, NotI and BamHI),
and a Kozak sequence immediately upstream of the initiating ATG. Table 7 below shows the plasmid identification for
the constructs detailed above:

Table 7

Construct | Amino acid at underlined codon | Sequence of codon | Plasmid ID
----------|--------------------------------|-------------------|-----------
A         | Tyrosine                       | TAC               | JNW771
B         | Threonine                      | ACA               | JNW773
B         | Tyrosine                       | TAC               | JNW770
C         | Tyrosine                       | TAC               | JNW777
D         | Tyrosine                       | TAC               | JNW769



[0194] The cellular responses following immunisation with p7313-ie (empty vector), pVAC-P501S (JNW735), JNW770,
JNW771 and JNW773 were assessed by ELISPOT following a primary immunisation by PMID at day 0 and three boosts
at day 21, 42 and 70. Assays were carried out 7 days post boost. Figure 27 shows that good IL-2 ELISPOT responses
were detected in mice immunised with JNW770, JNW771 and JNW773.

Example IX. Immunogenicity experiments using particle-mediated intra-dermal delivery (PMID) studies 

[0195] Full-length P501S, when delivered by particle mediated intra-dermal delivery (PMID), generates good antibody
and cellular responses. These data demonstrate that PMID is a very effective delivery route. Furthermore, comparison
of P501S and CPC-P501S confirms that CPC-P501S induces a stronger immune response as determined by peptide
ELISPOT.

1.- Materials & Methods

1.1. Cutaneous gene gun immunisation 

[0196] Plasmid DNA was precipitated onto 2µm diameter gold beads using calcium chloride and spermidine. Loaded
beads were coated onto Tefzel tubing as described (Eisenbraum et al, 1993; Pertmer et al, 1996). Particle bombardment
was performed using the Accell gene delivery system (PCT WO 95/19799). For each plasmid, female C57BL/6 mice
were immunised on days 0, 21, 42 and 70. Each administration consisted of two bombardments with DNA/gold, providing
a total dose of approximately 4-5 µg of plasmid.

1.2. ELISPOT assays for T cell responses to the P501S gene product

a) Preparation of splenocytes 

[0197] Spleens were obtained from immunised animals at 7-14 days post boost. Spleens were processed by grinding
between glass slides to produce a cell suspension. Red blood cells were lysed by ammonium chloride treatment and
debris was removed to leave a fine suspension of splenocytes. Cells were resuspended at a concentration of 8x10^8/ml
in RPMI complete media for use in ELISPOT assays.

b) Screening of peptide library 

[0198] A peptide library covering a majority of the P501S sequence was obtained from Corixa Corp. The library
contained fifty 15-20mer peptides overlapping by 4-11 amino acids. The peptides are numbered 1-50. In
addition, a prediction programme (H-G. Rammensee, et al.: Immunogenetics, 1999, 50: 213-219)
(http://syfpeithi.bmi-heidelberg.com/) was used to predict putative Kb and Db epitopes from the P501S sequence. The ten
best epitopes for Kb and Db were ordered from Mimotopes (UK) and included in the library (peptides 51-70). For screening
of the peptide library, peptides were used at a final concentration of 50µg/ml (approx. 25-50µM) in IFNγ and IL-2 ELISPOTS
using the protocol described below. For IFNγ ELISPOTS, IL-2 was added to the assays at 10ng/ml. Splenocytes used
for the screening were taken at day 84 from C57BL/6 mice immunised at day 0, 21, 42 and 70. Three peptides were
identified from the library screen: peptides 18 (HCRQAYSVYAFMISLGGCLG), 22 (GLSAPSLSPHCCPCRARLAF) and
48 (VCLAAGITYVPPLLLEVGV). These peptides were subsequently used in the ELISPOT assays.

c) ELISPOT assay 

[0199] Plates were coated with 15µg/ml (in PBS) rat anti mouse IFNγ or rat anti mouse IL-2 (Pharmingen). Plates
were coated overnight at +4°C. Before use the plates were washed three times with PBS. Splenocytes were added to
the plates at 4x10^5 cells/well. Peptides identified in the library screen were re-ordered from Genemed Synthesis and
used at a final concentration of 50µg/ml. CPC-P501S protein (GSKBio) was used in the assay at 20µg/ml. ELISPOT
assays were carried out in the presence of either IL-2 (10ng/ml), IL-7 (10ng/ml) or no cytokine. Total volume in each
well was 200µl. Plates containing peptide stimulated cells were incubated for 16 hours in a humidified 37°C incubator.
e) Development of ELISPOT assay plates.

[0200] Cells were removed from the plates by washing once with water (with a 10 minute soak to ensure lysis of cells)
and three times with PBS. Biotin conjugated rat anti mouse IFNγ or IL-2 (Pharmingen) was added at Vg/ml in PBS.
Plates were incubated with shaking for 2 hours at room temperature. Plates were then washed three times with PBS
before addition of Streptavidin alkaline phosphatase (Caltag) at 1/1000 dilution. Following three washes in PBS, spots
were revealed by incubation with BCICP substrate (Biorad) for 15-45 mins. Substrate was washed off using water and
plates were allowed to dry. Spots were enumerated using an image analysis system devised by Brian Hayes, Asthma
Cell Biology unit, GSK.

1.3. ELISA assay for antibodies to the P501S gene product

[0201] Serum samples were obtained from the animals by venepuncture on days -1, 28, 49 and 56, and assayed for
the presence of anti-P501S antibodies. ELISA was performed using Nunc Maxisorp plates coated overnight at 4°C with
0.5µg/ml of CPC-P501S protein (GSKBio) in sodium bicarbonate buffer. After washing with TBS-Tween (Tris-buffered
saline, pH 7.4 containing 0.05 % of Tween 20) the plates were blocked with Blocking buffer (3% BSA in TBS-Tween
buffer) for 2hrs at room temperature. All sera were incubated at 1:100 dilution for 1 hr at RT in Blocking buffer. Antibody
binding was detected using HRP-conjugated rabbit anti-mouse immunoglobulins (#P0260, Dako) at 1:2000 dilution in
Blocking buffer. Plates were washed again and bound conjugate detected using Fast OPD colour reagents (Sigma, UK).
The reaction was stopped by the addition of 3M sulphuric acid, and the OPD product quantitated by measuring the
absorbance at 490nm.

1.4. Transient transfection assays

[0202] Human P501S expression from various DNA constructs was analysed by transient transfection of the plasmids
into CHO (Chinese hamster ovary) cells followed by Western blotting on total cell protein. Transient transfections were
performed with the Transfectam reagent (Promega) according to the manufacturer's guidelines. In brief, 24-well tissue
culture plates were seeded with 5x10^4 CHO cells per well in 1 ml DMEM complete medium (DMEM, 10% FCS, 2mM
L-glutamine, penicillin 100IU/ml, streptomycin 100µg/ml) and incubated for 16 hours at 37°C. 0.5µg DNA was added to
25µl of 0.3M NaCl (sufficient for one well) and 2µl of Transfectam was added to 25µl of Milli-Q. The DNA and Transfectam
solutions were mixed gently and incubated at room temperature for 15 minutes. During this incubation step, the cells
were washed once in PBS and covered with 150µl of serum free medium (DMEM, 2mM L-glutamine). The
DNA-Transfectam solution was added drop wise to the cells, and the plate was gently shaken and incubated at 37°C for
4-6 hours. 500µl of DMEM complete medium was added and the cells incubated for a further 48-72 hours at 37°C.

2. Western blot analysis of CHO cells transiently transfected with P501S plasmids 

[0203] The transiently transfected CHO cells were washed with PBS and treated with a Versene (1:5000)/0.025%
trypsin solution to transfer the cells into suspension. Following trypsinisation, the CHO cells were pelleted and
resuspended in 50µl of PBS. An equal volume of 2x NP40 lysis buffer was added and the cells incubated on ice for 30 minutes.
100µl of 2x TRIS-Glycine SDS sample buffer (Invitrogen) containing 50mM DTT was added and the solution heated to
95°C for 5 minutes. 1-20µl of sample was loaded onto a 4-20% TRIS-Glycine Gel 1.5mm (Invitrogen) and electrophoresed
at constant voltage (125V) for 90 minutes in 1x TRIS-Glycine buffer (Invitrogen). A pre-stained broad range marker (New
England Biolabs, #P7708S) was used to size the samples. Following electrophoresis, the samples were transferred to
Immobilon-P PVDF membrane (Millipore), pre-wetted in methanol, using an Xcell III Blot Module (Invitrogen), 1x Transfer
buffer (Invitrogen) containing 20% methanol and a constant voltage of 25V for 90 minutes. The membrane was blocked
overnight at 4°C in TBS-Tween (Tris-buffered saline, pH 7.4 containing 0.05 % of Tween 20) containing 3% dried skimmed
milk (Marvel). The primary antibody (10E3) was diluted 1:1000 and incubated with the membrane for 1 hour at room
temperature. Following extensive washing in TBS-Tween, the secondary antibody (HRP-conjugated rabbit anti-mouse
immunoglobulins (#P0260, Dako)) was diluted 1:2000 in TBS-Tween containing 3% dried skimmed milk and incubated
with the membrane for one hour at room temperature. Following extensive washing, the membrane was incubated with
Supersignal West Pico Chemiluminescent substrate (Pierce) for 5 minutes. Excess liquid was removed and the membrane
sealed between two sheets of cling film, and exposed to Hyperfilm ECL film (Amersham-PharmaciaBiotech) for 1-30
minutes.

3. Generation of the Full-length human P501S expression cassette

[0204] The starting point for the construction of a P501S expression cassette was the plasmid pcDNA3.1-P501 S 
(Corixa Corp), which has a pcDNA3.1 backbone (Invitrogen) containing a full-length human P501 S cDNA cassette 

5 cloned between the EcoRI and Notl sites. This vector is also termed JNW673. The presence of P501S was confirmed 
by fluorescent sequencing. The sequence of the cDNA cassette is given by the NCBI/Genbank sequence (accession 
number AY033593). Human P501 S was PCR amplified from JNW673 template DNA, restricted with Xbal and Sail and 
cloned into the Nhel/Xhol sites of pVAC generating vector JNW680. The correct orientation of the fragment relative to 
the CMV promoter was confirmed by PCR and by DNA sequencing. The sequence of the expression cassette is shown 

10 in Figure 12 (SEQ ID NO: 17). 

To construct a CPC-P501S expression cassette, CPC-P501S was PCR amplified from the vector pRIT15201 (see
Figure 7), restricted with XbaI and SalI and cloned into the NheI and XhoI sites of pVAC, generating plasmid JNW735.
The correct orientation was confirmed by PCR and sequencing. The sequence of the CPC-P501S expression cassette
is shown in Figure 13 (SEQ ID NO: 18).
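
The cloning steps above rely on XbaI-cut ends ligating into NheI-cut vector and SalI-cut ends ligating into XhoI-cut vector. The following is a minimal illustrative sketch, not part of the original protocol, of why these pairs are compatible: each pair leaves the same four-base 5' overhang. The recognition sites and cut positions are the standard published ones for these four enzymes; the function names are hypothetical.

```python
# Recognition site (top strand, 5'->3') and the index at which the top strand
# is cut. All four enzymes cut after the first base of a palindromic 6-mer.
ENZYMES = {
    "XbaI": ("TCTAGA", 1),
    "NheI": ("GCTAGC", 1),
    "SalI": ("GTCGAC", 1),
    "XhoI": ("CTCGAG", 1),
}


def overhang(enzyme: str) -> str:
    """Return the single-stranded 5' overhang left by a symmetric cut."""
    site, cut = ENZYMES[enzyme]
    # For a palindromic site cut at the same offset on both strands, the
    # overhang is the stretch between the cut and its mirror position.
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Two sticky ends can anneal if their overhangs are identical
    (each overhang here is its own reverse complement)."""
    return overhang(enzyme_a) == overhang(enzyme_b)


if __name__ == "__main__":
    for pair in [("XbaI", "NheI"), ("SalI", "XhoI"), ("XbaI", "XhoI")]:
        print(pair, overhang(pair[0]), overhang(pair[1]), compatible(*pair))
```

Because the two ends of the insert carry different overhangs (CTAG and TCGA), the fragment can ligate in only one direction, which is consistent with the orientation checks by PCR and sequencing described above.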


4. Expression of human P501S from plasmids JNW680 and JNW735 

[0205] The P501S expression plasmids were transiently transfected into CHO cells and a total cell lysate prepared
as described in methods. A Western blot of the total cell lysate identified single bands of approximately 55kDa and 62kDa
for samples transfected with JNW680 and JNW735 respectively (Figure 21). This is consistent with the predicted
molecular weights of 59.3kDa and 63.3kDa for P501S and CPC-P501S respectively. The addition of the CPC tag does
not adversely affect the expression of P501S.
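
The text does not say how the predicted molecular weights were obtained. As a hedged illustration only, the sketch below shows one common way to estimate a polypeptide's molecular weight from its one-letter amino acid sequence (sum of average residue masses plus one water) and to express the gap between an observed band and a prediction as a percentage. The function names are hypothetical, and post-translational modifications and gel migration artefacts are ignored.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def molecular_weight_kda(sequence: str) -> float:
    """Estimate the average molecular weight (kDa) of an unmodified polypeptide."""
    residues = sequence.upper()
    total = sum(AVERAGE_RESIDUE_MASS[aa] for aa in residues) + WATER_MASS
    return total / 1000.0


def percent_difference(observed_kda: float, predicted_kda: float) -> float:
    """Signed difference of the observed band from the prediction, in percent."""
    return 100.0 * (observed_kda - predicted_kda) / predicted_kda


if __name__ == "__main__":
    # Values quoted in paragraph [0205]: observed band vs predicted weight.
    for name, observed, predicted in [("P501S", 55.0, 59.3), ("CPC-P501S", 62.0, 63.3)]:
        print(name, round(percent_difference(observed, predicted), 1), "%")
```

Apparent sizes on SDS-PAGE are approximate, so deviations of a few percent between band position and calculated weight are generally unremarkable.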

5. Results 


5.1. Antibody responses to human P501S following PMID Immunisation 

[0206] The antibody responses following immunisation with pVAC (empty vector) and pVAC-P501S (JNW680) were
assessed by ELISA following a primary immunisation by PMID at day 0 and three boosts at day 21, day 42 and day
70. Figure 22 shows the antibody responses from sera taken at day -1, day 28 and day 49 (mice A1-3, B1-3) and day
56 (mice A4-6, B4-9). Whilst there were some non-specific responses to the pVAC empty vector, specific responses to
the P501S construct were seen in 5 of 9 mice.

5.2. Identification of novel T cell epitopes from human P501S in C57BL/6 mice by screening of a P501S peptide
library

[0207] Following immunisation with JNW680 (pVAC-P501S) by PMID at day 0 and three boosts at day 21, day 42
and day 70, ELISPOT assays were carried out at day 84. Peptides from the P501S library were tested at 50 µg/ml
final concentration. From this initial screen, three peptides were found to stimulate IFN-γ and/or IL-2 secretion:
peptides 18, 22 and 48 (Figure 23). These peptides were used in subsequent cellular assays.

5.3. Cellular responses to pVAC-P501S (JNW680) following PMID immunisation 

[0208] The cellular responses following immunisation with pVAC (empty vector) and pVAC-P501S were assessed
by ELISPOT following a primary immunisation by PMID at day 0 and three boosts at day 21, 42 and 70. Assays were
carried out 7 days post boost. Two different assay conditions were used: 1) Peptides 18, 22 and 48 identified in the
peptide library screen, used at 50 µg/ml final concentration, and 2) CPC-P501S protein used at 20 µg/ml final concentration.
Figure 24A shows that whilst there were no P501S-specific responses to the empty vector (A4-6), the pVAC-P501S
construct induced specific IFN-γ responses to Peptides 18 and 22 in all mice (B6-9), whilst one mouse (B7) also showed
an IFN-γ response to Peptide 48. Figure 24B shows that all mice showed specific IL-2 responses to Peptides 18, 22 and
48. Furthermore, pVAC-P501S immunised mice (B6-9) also showed moderate IL-2 responses to CPC-P501S, whereas
the empty vector immunised mice (A4-6) showed no responses.

5.4. Comparison of cellular responses to P501S and CPC-P501S following PMID immunisation. 

[0209] The cellular responses following immunisation with pVAC (empty vector), pVAC-P501S (JNW680) and
CPC-P501S (JNW735) were assessed by ELISPOT following a primary immunisation by PMID at day 0 and boosts at day
21 and 42. Assays were carried out 7 days post boost. Two different assay conditions were used: 1) Peptides 18, 22
and 48 identified in the peptide library screen, used at 50 µg/ml final concentration, and 2) CPC-P501S protein used at
20 µg/ml final concentration. Figure 25 shows that at day 28, CPC-P501S induced good IL-2 responses to 10 µg/ml of
peptide 22, whilst there were no P501S-specific responses to either the empty vector or the pVAC-P501S. These
results were also seen using CPC-P501S protein to re-stimulate the splenocytes. At day 49 (post 2nd boost), the
responses induced by P501S and CPC-P501S were equivalent. These data suggest that the addition of the CPC tag
improves the kinetics and/or magnitude of the response to P501S.

Example IX. Immunogenicity experiments in mice using P501S Protein + adjuvant studies

1. Design and adjuvant formulation

[0210] The immune response induced by vaccination using the recombinant purified CPC-P501S protein formulated
in adjuvants is characterized in experiments performed in mice.

Groups of 5 to 10 eight-week-old female C57BL/6 mice are vaccinated 2-6 times intramuscularly at 2-week intervals
with 10 µg of the CPC-P501S protein formulated in different adjuvant systems. The volume administered corresponds
to 1/10th of a human dose (50 µl).

The serology (total Ig response) and cellular response (T cell lymphoproliferation and cytokine production) are analyzed
on spleen cells, 6-14 days after the last vaccination using standard protocols as described in Gerard, C. et al., 2001,
Vaccine 19, 2583-2589.

[0211] The data of one representative experiment are shown. It included 5 groups of eight C57BL/6 female mice which
received 4 intramuscular injections of CPC-P501 (10 µg) + adjuvant (A, B, C) at days 0, 14, 28, 42. Example V provides
an experimental protocol of how to carry out the formulations. Briefly, the adjuvant formulations are as follows (quantities
given for one dose of 100 µl):

- Adjuvant A: QS21 (10 µg), MPL (10 µg) and CpG7909 (100 µg) made according to the method disclosed in WO
00/62800;

- Adjuvant B: formulation of QS21 (20 µg), MPL (20 µg), CpG7909 (100 µg) and 50 µl SB62 oil-in-water emulsion
(WO 95/17210);

- Adjuvant C: formulation of QS21 (10 µg), MPL (10 µg), CpG7909 (100 µg) and 10 µl SB62 oil-in-water emulsion
(WO 99/12565).

2. Serology 

[0212] The total Ig response induced by vaccination was measured by ELISA using either the CPC-P501 or
RA12-P501 (C term), which is a truncated form of the P501 protein corresponding to the C terminus of the protein fused,
at its N terminus, to a TB-derived protein RA12 (Ra12 is derived from the MTB32A antigen described in Skeiky et al.,
Infection and Immun. (1999) 67:3998-4007).

The adjuvanted CPC-P501S proteins give a good antibody response after vaccination.

3. Cellular response

3.1. Lymphoproliferation 

[0213] 7 days after the latest vaccination, lymphoproliferation was performed on spleen cells individually. 2 x 10^5 spleen
cells were plated in quadruplicate, in 96-well microplates, in RPMI medium containing 1% normal mouse serum. After 72
hours of re-stimulation with either the immunogen (CPC-P501) or the truncated protein (RA12-P501) at different
concentrations, 1 µCi 3H-thymidine (Amersham, 5 Ci/ml) was added. After 16 hours, cells were harvested onto filter plates.
Incorporated radioactivity was counted in a β counter. Results are expressed in CPM or as stimulation indexes (geomean
CPM in cultures with antigen / geomean CPM in cultures without antigen).
Re-stimulation with ConA (2 µg/ml) was included as a positive control.
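
The stimulation index defined above (geometric mean CPM with antigen divided by geometric mean CPM without antigen) can be sketched as follows. This is an illustrative calculation only; the function names and the CPM values in the usage example are hypothetical and are not data from the experiments described here.

```python
import math
from typing import Sequence


def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean of strictly positive readings (e.g. replicate CPM counts)."""
    if not values:
        raise ValueError("at least one reading is required")
    if any(v <= 0 for v in values):
        raise ValueError("readings must be strictly positive")
    # exp(mean(log(x))) is numerically safer than multiplying large counts.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def stimulation_index(cpm_with_antigen: Sequence[float],
                      cpm_without_antigen: Sequence[float]) -> float:
    """Geomean CPM in cultures with antigen / geomean CPM in cultures without antigen."""
    return geometric_mean(cpm_with_antigen) / geometric_mean(cpm_without_antigen)


if __name__ == "__main__":
    # Hypothetical quadruplicate wells for one mouse.
    with_antigen = [12000.0, 15000.0, 11000.0, 14000.0]
    without_antigen = [1500.0, 1800.0, 1600.0, 1700.0]
    print(round(stimulation_index(with_antigen, without_antigen), 2))
```

The geometric mean is less sensitive than the arithmetic mean to a single outlying replicate well, which is one plausible reason it is used for quadruplicate CPM readings.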

[0214] As shown in Figure 26, a P501-specific lymphoproliferation is seen in the spleen of all groups of mice receiving
the adjuvanted protein after in vitro re-stimulation with either the immunogen or another P501 protein made in another
expression system (E. coli), indicating that T cells have been primed in vivo by the vaccination.

3.2. IFNg production measured by intracellular staining of spleen cells



[0215] Bone Marrow Dendritic Cells (BMDC) were obtained after culture of mouse PBL for 7 days in the presence of GM-CSF.
7 days after the latest vaccination, spleen or PBL are collected and a cell suspension prepared. 10^6 cells (1 pool per group)
were incubated +/- 18 hrs with 10^5 BMDC pulsed overnight with 10 µg/ml of either the CPC-P501 protein or the RA12.
After a treatment with the 2.4G2 antibody, spleen cells were stained with fluorescent anti-CD4 and anti-CD8 antibodies
(anti-CD4-APC and anti-CD8-PerCP). After a permeabilization and fixation step, cells were stained with a fluorescent
anti-IFNg-FITC antibody.

[0216] In mice vaccinated with CPC-P501 in different adjuvants, both CD4 and CD8 T cells are shown to produce IFNg
in response to DC pulsed with either the immunogen or the C-term P501 made in E. coli (as shown by intracellular
staining of spleen and PBLs). There is an increase of 4-10X in the % of cells making this cytokine in the groups receiving
the adjuvanted CPC-P501S compared to the protein alone, and between 0.1 and 10% of CD4 or CD8 T cells are shown
to produce IFNg.

[0217] In conclusion, these data show that the adjuvanted CPC-P501 protein is immunogenic in mice.
Both P501-specific humoral and cellular responses, including IFNg production by CD4 and CD8 T cells, can be detected
after several intramuscular vaccinations with CPC-P501 in adjuvants.

Example X. CPC-MUC-1 constructs and sequences 


[0218] CPC sequence is taken from nucleotide SEQ ID NO. 28. 

MUC1 sequence is available from Genbank database (accession number NM_002456). 

1. MUC1-CPC construct

[0219] Due to the presence of a signal sequence in MUC1 that is cleaved post-translationally, the CPC motif was 
placed at the C-terminus. The resulting MUC1-CPC DNA sequence is depicted in SEQ ID NO. xx (figure 28A) and the 
corresponding MUC1-CPC protein sequence in SEQ ID NO. yy (figure 28B). 

2. ss-CPC-MUC1 construct

[0220] Due to the presence of a signal sequence in MUC1 that is cleaved post-translationally, the MUC1 signal
sequence was replaced by a heterologous leader sequence (from the human immunoglobulin heavy chain) and the
CPC motif was inserted between the heterologous leader sequence and the MUC1 sequence, generating a sequence
termed ss-CPC-MUC1 as depicted in figure 29.

SEQUENCE LISTING 

[0221] 

<110> GlaxoSmithKline Biologicals sa
Glaxo Group Ltd

<120> Immunogenic compositions

<130> B45311
<160> 52

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 15
<212> PRT
<213> Streptococcus pneumoniae

<400> 1

Gly Trp Gln Lys Asn Asp Thr Gly Tyr Trp Tyr Val His Ser Asp
1               5                   10                  15



<210> 2



EP1 511 768 B1 



<211>21 
<212> PRT 

<213> Streptococcus pneumoniae 
<400> 2 

Gly Ser Tyr Pro Lys Asp Lys Phe Glu Lys Ile Asn Gly Thr Trp Tyr
1               5                   10                  15

Tyr Phe Asp Ser Ser
            20



<210>3 
<211> 22 
<212> PRT 

<213> Streptococcus pneumoniae 
<400> 3 



Gly Tyr Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp
1               5                   10                  15

Tyr Trp Phe Asp Asn Ser
            20



<210>4 
<211>20 
<212> PRT 

<213> Streptococcus pneumoniae 
<400> 4

Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp Lys Trp Tyr Tyr
1               5                   10                  15

Phe Asn Glu Glu
            20



<210>5 
<211>21 
<212> PRT 

<213> Streptococcus pneumoniae 
<400> 5 



Gly Ala Met Lys Thr Gly Trp Val Lys Tyr Lys Asp Thr Trp Tyr Tyr
1               5                   10                  15

Leu Asp Ala Lys Glu
            20



<210>6 
<211>23 
<212> PRT 

<213> Streptococcus pneumoniae 
<400> 6 



Gly Ala Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly
1               5                   10                  15

Trp Tyr Tyr Leu Lys Pro Asp
            20



<210>7 
<211> 142 
<212> PRT 

<213> Streptococcus pneumoniae 
<400> 7 



Gly Trp Gln Lys Asn Asp Thr Gly Tyr Trp Tyr Val His Ser Asp Gly
1               5                   10                  15

Ser Tyr Pro Lys Asp Lys Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr
            20                  25                  30

Phe Asp Ser Ser Gly Tyr Met Leu Ala Asp Arg Trp Arg Lys His Thr
        35                  40                  45

Asp Gly Asn Trp Tyr Trp Phe Asp Asn Ser Gly Glu Met Ala Thr Gly
    50                  55                  60

Trp Lys Lys Ile Ala Asp Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala
65                  70                  75                  80

Met Lys Thr Gly Trp Val Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp
                85                  90                  95

Ala Lys Glu Gly Ala Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp
            100                 105                 110

Gly Thr Gly Trp Tyr Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg
        115                 120                 125

Pro Glu Phe Thr Val Glu Pro Asp Gly Leu Ile Thr Val Lys
    130                 135                 140









<210>8 
<211> 112 
<212> PRT 

<213> Streptococcus pneumoniae 
<400> 8 



Tyr Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys Phe Glu Lys Ile
1               5                   10                  15

Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr Met Leu Ala Asp
            20                  25                  30

Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp Phe Asp Asn Ser
        35                  40                  45

Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp Lys Trp Tyr Tyr
    50                  55                  60

Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val Lys Tyr Lys Asp
65                  70                  75                  80

Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met Val Ser Asn Ala
                85                  90                  95

Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp Tyr Tyr Leu Lys Pro Asp
            100                 105                 110
<210> 9
<211>45 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 9 

ggctggcaga agaatgacac tggctactgg tacgtacatt cagac 

<210> 10 
<211> 63 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 10 



<210> 11 
<211>66 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 11

ggctatatgc ttgcagaccg ctggaggaag cacacagacg gcaactggta ctggttcgac 60 
aactca 66 



<210> 12 
<211> 60 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 12 

ggcgaaatgg ctacaggctg gaagaaaatc gctgataagt ggtactattt caacgaagaa 60 

<210> 13 
<211> 63 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 13 

ggtgccatga agacaggctg ggtcaagtac aaggacactt ggtactactt agacgctaaa 60 
gaa 63 

<210> 14 
<211>69 
<212> DNA 

<213> Streptococcus pneumoniae 



<400> 14 
ggcgccatgg tatcaaatgc ctttatccag tcagcggacg gaacaggctg gtactacctc 60



<210> 15 
<211>429 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 15 



ggctggcaga agaatgacac tggctactgg tacgtacatt cagacggctc ttatccaaaa 60
gacaagtttg agaaaatcaa tggcacttgg tactactttg acagttcagg ctatatgctt 120
gcagaccgct ggaggaagca cacagacggc aactggtact ggttcgacaa ctcaggcgaa 180
atggctacag gctggaagaa aatcgctgat aagtggtact atttcaacga agaaggtgcc 240
atgaagacag gctgggtcaa gtacaaggac acttggtact acttagacgc taaagaaggc 300
gccatggtat caaatgcctt tatccagtca gcggacggaa caggctggta ctacctcaaa 360
ccagacggaa cactggcaga caggccagaa ttcacagtag agccagatgg cttgattaca 420
gtaaaataa 429



<210> 16 
<211> 336 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 16 



tacgtacatt ccgacggctc ttatccaaaa gacaagtttg agaaaatcaa tggcacttgg 60
tactactttg acagttcagg ctatatgctt gcagaccgct ggaggaagca cacagacggc 120
aactggtact ggttcgacaa ctcaggcgaa atggctacag gctggaagaa aatcgctgat 180
aagtggtact atttcaacga agaaggtgcc atgaagacag gctgggtcaa gtacaaggac 240
acttggtact acttagacgc taaagaaggc gccatggtat caaatgcctt tatccagtca 300
gcggacggaa caggctggta ctacctcaaa ccagac 336



<210> 17 
<211> 1674 
<212> DNA 
<213> Homo sapiens 

<400> 17 
gccaccatgg tccagaggct gtgggtgagc cgcctgctgc ggcaccggaa agcccagctc 60
ttgctggtca acctgctaac ctttggcctg gaggtgtgtt tggccgcagg catcacctat 120
gtgccgcctc tgctgctgga agtgggggta gaggagaagt tcatgaccat ggtgctgggc 180
attggtccag tgctgggcct ggtctgtgtc ccgctcctag gctcagccag tgaccactgg 240
cgtggacgct atggccgccg ccggcccttc atctgggcac tgtccttggg catcctgctg 300
agcctctttc tcatcccaag ggccggctgg ctagcagggc tgctgtgccc ggatcccagg 360
cccctggagc tggcactgct catcctgggc gtggggctgc tggacttctg tggccaggtg 420
tgcttcactc cactggaggc cctgctctct gacctcttcc gggacccgga ccactgtcgc 480
caggcctact ctgtctatgc cttcatgatc agtcttgggg gctgcctggg ctacctcctg 540
cctgccattg actgggacac cagtgccctg gccccctacc tgggcaccca ggaggagtgc 600
ctctttggcc tgctcaccct catcttcctc acctgcgtag cagccacact gctggtggct 660
gaggaggcag cgctgggccc caccgagcca gcagaagggc tgtcggcccc ctccttgtcg 720
ccccactgct gtccatgccg ggcccgcttg gctttccgga acctgggcgc cctgcttccc 780
cggctgcacc agctgtgctg ccgcatgccc cgcaccctgc gccggctctt cgtggctgag 840
ctgtgcagct ggatggcact catgaccttc acgctgtttt acacggattt cgtgggcgag 900
gggctgtacc agggcgtgcc cagagctgag ccgggcaccg aggcccggag acactatgat 960
gaaggcgttc ggatgggcag cctggggctg ttcctgcagt gcgccatctc cctggtcttc 1020
tctctggtca tggaccggct ggtgcagcga ttcggcactc gagcagtcta tttggccagt 1080
gtggcagctt tccctgtggc tgccggtgcc acatgcctgt cccacagtgt ggccgtggtg 1140
acagcttcag ccgccctcac cgggttcacc ttctcagccc tgcagatcct gccctacaca 1200
ctggcctccc tctaccaccg ggagaagcag gtgttcctgc ccaaataccg aggggacact 1260
ggaggtgcta gcagtgagga cagcctgatg accagcttcc tgccaggccc taagcctgga 1320
gctcccttcc ctaatggaca cgtgggtgct ggaggcagtg gcctgctccc acctccaccc 1380
gcgctctgcg gggcctctgc ctgtgatgtc tccgtacgtg tggtggtggg tgagcccacc 1440
gaggccaggg tggttccggg ccggggcatc tgcctggacc tcgccatcct ggatagtgcc 1500



ttcctgctgt cccaggtggc cccatccctg tttatgggct ccattgtcca gctcagccag 1560 
tctgtcactg cctatatggt gtctgccgca ggcctgggtc tggtcgccat ttactttgct 1620 
acacaggtag tatttgacaa gagcgacttg gccaaatact cagcgtaggt cgag 1674 

<210> 18 
<211> 1947 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Hybrid gene between St. pneum. C-LytA, P2 T helper epitope and human P501S. 
<400> 18 
gccaccatgg cggccgctta cgtacattcc gacggctctt atccaaaaga caagtttgag 60
aaaatcaatg gcacttggta ctactttgac agttcaggct atatgcttgc agaccgctgg 120
aggaagcaca cagacggcaa ctggtactgg ttcgacaact caggcgaaat ggctacaggc 180
tggaagaaaa tcgctgataa gtggtactat ttcaacgaag aaggtgccat gaagacaggc 240
tgggtcaagt acaaggacac ttggtactac ttagacgcta aagaaggcgc catgcaatac 300
atcaaggcta actctaagtt cattggtatc actgaaggcg tcatggtatc aaatgccttt 360
atccagtcag cggacggaac aggctggtac tacctcaaac cagacggaac actggcagac 420
aggccagaaa agttcatgta catggtgctg ggcattggtc cagtgctggg cctggtctgt 480
gtcccgctcc taggctcagc cagtgaccac tggcgtggac gctatggccg ccgccggccc 540
ttcatctggg cactgtcctt gggcatcctg ctgagcctct ttctcatccc aagggccggc 600
tggctagcag ggctgctgtg cccggatccc aggcccctgg agctggcact gctcatcctg 660
ggcgtggggc tgctggactt ctgtggccag gtgtgcttca ctccactgga ggccctgctc 720
tctgacctct tccgggaccc ggaccactgt cgccaggcct actctgtcta tgccttcatg 780
atcagtcttg ggggctgcct gggctacctc ctgcctgcca ttgactggga caccagtgcc 840
ctggccccct acctgggcac ccaggaggag tgcctctttg gcctgctcac cctcatcttc 900
ctcacctgcg tagcagccac actgctggtg gctgaggagg cagcgctggg ccccaccgag 960
ccagcagaag ggctgtcggc cccctccttg tcgccccact gctgtccatg ccgggcccgc 1020
ttggctttcc ggaacctggg cgccctgctt ccccggctgc accagctgtg ctgccgcatg 1080
ccccgcaccc tgcgccggct cttcgtggct gagctgtgca gctggatggc actcatgacc 1140
ttcacgctgt tttacacgga tttcgtgggc gaggggctgt accagggcgt gcccagagct 1200
gagccgggca ccgaggcccg gagacactat gatgaaggcg ttcggatggg cagcctgggg 1260
ctgttcctgc agtgcgccat ctccctggtc ttctctctgg tcatggaccg gctggtgcag 1320
cgattcggca ctcgagcagt ctatttggcc agtgtggcag ctttccctgt ggctgccggt 1380
gccacatgcc tgtcccacag tgtggccgtg gtgacagctt cagccgccct caccgggttc 1440
accttctcag ccctgcagat cctgccctac acactggcct ccctctacca ccgggagaag 1500
caggtgttcc tgcccaaata ccgaggggac actggaggtg ctagcagtga ggacagcctg 1560
atgaccagct tcctgccagg ccctaagcct ggagctccct tccctaatgg acacgtgggt 1620
gctggaggca gtggcctgct cccacctcca cccgcgctct gcggggcctc tgcctgtgat 1680
gtctccgtac gtgtggtggt gggtgagccc accgaggcca gggtggttcc gggccggggc 1740
atctgcctgg acctcgccat cctggatagt gccttcctgc tgtcccaggt ggccccatcc 1800
ctgtttatgg gctccattgt ccagctcagc cagtctgtca ctgcctatat ggtgtctgcc 1860
gcaggcctgg gtctggtcgc catttacttt gctacacagg tagtatttga caagagcgac 1920
ttggccaaat actcagcgta ggtcgag 1947



<210> 19 
<211> 1662 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Codon optimised human P501S 
<400> 19 



atggtgcagc ggctctgggt gagccgcctc ctgcggcatc gcaaggccca gctcctgctg 60

gtgaatctgc tcacattcgg cctggaggtg tgcctggccg ccggcatcac ctacgtgccc 120

cccctcctgc tggaggtggg agtcgaggag aagttcatga ccatggtgct gggcattggg 180

cccgtcctgg gcctcgtgtg cgtgcctctc ctcggcagcg cttccgacca ttggcgcggc 240

cggtatggcc gcaggagacc cttcatctgg gctctgagtc tcggcatcct gctgagcctg 300

ttcctgatcc ctcgggccgg ctggctggcc gggctgctgt gccccgatcc tcggcccctg 360 

gagctggccc tgctgatcct cggcgtgggc ctgctggact tctgcggcca ggtgtgcttc 420 

acgcccctgg aggcactgct gagcgacctg ttccgggacc ccgaccattg ccgccaggcg 480 

tacagcgtgt acgccttcat gatctccctg ggaggctgcc tgggctacct gctccccgcc 540 

atcgattggg acaccagcgc actcgccccc tatctcggaa cacaggagga atgcctgttc 600 

ggattgttga cgctcatctt cctcacgtgc gtcgcggcca ccctgttggt ggccgaggag 660 

gccgccctgg ggcccaccga gccggccgag ggactgagcg ccccgagcct gagtccacac 720 

tgctgccctt gccgggcccg cctggccttc cgtaatctgg gcgccctcct gcctcggctc 780 

catcagctgt gttgcagaat gcctaggacg ctgcggcgcc tgttcgtcgc tgagttgtgc 840

tcctggatgg ctctcatgac cttcaccctg ttttatacgg acttcgtcgg ggagggcctg 900 

taccaggggg tgccgcgcgc cgagcccggg acagaggcgc gccgccacta cgacgaggga 960 

gtgcgtatgg gctccctggg cctcttcttg cagtgcgcca tcagtctggt tttctctctg 1020 

gtcatggaca ggctggtgca gcgcttcgga acccgggcgg tgtacctggc gagcgtggcc 1080 

gccttccccg tggctgccgg cgccacctgc ctctctcact cggtggccgt ggtcaccgcc 1140 

agcgccgccc tgaccgggtt caccttctct gccctgcaga ttctgcctta caccctggcc 1200 

agcctgtacc atcgcgagaa acaggtgttt ctccccaagt acagaggcga caccgggggc 1260 

gcctccagcg aggacagcct catgacctcc ttcctgcctg gccccaagcc cggcgcccct 1320 

ttccccaacg ggcacgtggg cgccggcggg agtgggctcc tgcccccccc tcctgcgctg 1380 

tgcggggcca gcgcctgcga cgtgagcgtg cgcgtggtgg tgggcgagcc caccgaggcc 1440 

cgcgtggtgc cgggcagagg catttgtctg gacctggcca tcctcgactc cgccttcctc 1500 

ctcagccagg tggccccgtc cctcttcatg ggctctatcg tccagctgtc tcagagcgtc 1560 

accgcttaca tggtgtccgc tgctggactg ggcttggtgg ctatttattt cgccacccag 1620 

gtggtgttcg acaagagcga cctggccaaa tactccgcct ga 1662 



<210> 20 
<211> 1662 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Codon optimised human P501S 
<400> 20 
atggtgcagc ggctgtgggt gtcccggctg ctgcgccata gaaaggccca gttgctgctg 60
gtgaacctgc tgactttcgg actggaggtg tgcctggctg ccgggatcac gtacgtgccc 120 
cccctgctgc tggaggtggg cgtggaggag aagttcatga caatggtgct gggcatcggc 180 
cccgtcctgg gcctcgtgtg tgtgcccctc ctcgggagtg cgtccgatca ttggcggggc 240 
cgctacggcc gccgcagacc gttcatctgg gccctgagcc tggggatcct gctctctctc 300 
ttcctgatcc cccgggccgg ctggctggcc ggcctgctgt gtcccgaccc ccgccctctg 360 
gagctggccc tcctgatcct gggcgtgggc ttgttggact tctgcggcca ggtgtgtttc 420 
actcccctgg aggctctgct ctccgacctc ttccgcgacc ccgaccactg taggcaggct 480 
tacagcgtgt acgccttcat gatcagtctg gggggatgcc tgggctatct gctgcccgct 540 
atcgactggg acaccagcgc cctggccccc tacctgggga ctcaggagga gtgcctgttc 600 
ggcctgctca ccttgatctt cctgacgtgc gtcgccgcca ccctgctggt ggccgaggag 660 
gcggccctgg ggcccaccga gcccgccgag ggcctgagcg ctcccagcct gagcccccat 720
tgctgcccgt gcagggctag gctcgccttc aggaatctgg gcgctttgct gccccgcctg 780 
catcagctgt gctgtcgcat gcctcgcacc ctgcgccgcc tgttcgtcgc tgagctctgt 840 
tcctggatgg ccctgatgac gttcaccctc ttctacaccg acttcgtggg ggagggcctg 900 
taccagggcg tgcccagggc cgagcccggc accgaggcta ggcgccatta cgacgagggc 960 
gtcaggatgg gctctctggg cctcttcctg cagtgcgcca tcagtctggt gttctctctg 1020 
gtgatggacc ggctggtgca gcgcttcggc acccgggccg tgtacctcgc ctctgtggcg 1080 
gctttccccg tcgccgccgg cgcgacctgc ctgtctcatt ctgtcgccgt ggtgaccgcc 1140 
agcgccgccc tgaccggctt caccttcagt gcgctccaga ttctgcccta caccctggcg 1200 
tctctgtacc atcgcgagaa gcaggtgttc ctgcccaagt accgcgggga cacaggggga 1260 
gcttcctctg aggacagcct gatgaccagc ttcttgcccg gccccaagcc gggggcccct 1320 
ttccccaacg gccatgtcgg ggcgggcggc agcggcctgc tccctccccc ccccgccctg 1380 
tgcggcgcta gtgcctgcga cgtgagcgtg cgggtggtgg tgggggagcc caccgaggct 1440 
agggtcgtgc ctggccgggg gatctgcctg gacctggcca tcctcgactc cgccttcctg 1500
ctctcccagg tggcgcccag cctgttcatg ggcagtatcg tgcagctgag ccagagcgtg 1560 
accgcctaca tggtgagcgc cgccggcctg gggttggtgg ccatctactt tgccacccag 1620 



gtcgtgttcg acaagagcga tctcgccaag tatagcgcct ga 1662 



<210> 21 
<211> 1688 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Codon optimised human P501S 



<400> 21 
gacggctagc gccaccatgg tgcagcggct ctgggtgagc cgcctcctgc ggcatcgcaa 60
ggcccagctc ctgctggtga atctgctcac attcggcctg gaggtgtgcc tggccgccgg 120
catcacctac gtgccccccc tcctgctgga ggtgggagtc gaggagaagt tcatgaccat 180
ggtgctgggc attgggcccg tcctgggcct cgtgtgcgtg cctctcctcg gcagcgcttc 240
cgaccattgg cgcggccggt atggccgcag gagacccttc atctgggctc tgagtctcgg 300
catcctgctg agcctgttcc tgatccctcg ggccggctgg ctggccgggc tgctgtgccc 360
cgatcctcgg cccctggagc tggccctgct gatcctcggc gtgggcctgc tggacttctg 420
cggccaggtg tgcttcacgc ccctggaggc actgctgagc gacctgttcc gggaccccga 480
ccattgccgc caggcgtaca gcgtgtacgc cttcatgatc tccctgggag gctgcctggg 540
ctacctgctc cccgccatcg attgggacac cagcgcactc gccccctatc tcggaacaca 600
ggaggaatgc ctgttcggac tgctgacgct catcttcctc acgtgcgtcg cggccaccct 660
gttggtggcc gaggaggccg ccctggggcc caccgagccg gccgagggac tgagcgcccc 720
gagcctgagt ccacactgct gcccttgccg ggcccgcctg gccttccgta atctgggcgc 780
cctcctgcct cggctccatc agctgtgttg cagaatgcct aggacgctgc ggcgcctgtt 840
cgtcgctgag ttgtgctcct ggatggctct catgaccttc accctgtttt atacggactt 900
cgtcggggag ggcctgtacc agggggtgcc gcgcgccgag cccgggacag aggcgcgccg 960
ccactacgac gagggagtgc gtatgggctc cctgggcctc ttcttgcagt gcgccatcag 1020
tctggttttc tctctggtca tggacaggct ggtgcagcgc ttcggaaccc gggcggtgta 1080
cctggcgagc gtggccgcct tccccgtggc tgccggcgcc acctgcctct ctcactcggt 1140
ggccgtggtc accgccagcg ccgccctgac cgggttcacc ttctctgccc tgcagattct 1200
gccttacacc ctggccagcc tgtaccatcg cgagaaacag gtgtttctcc ccaagtacag 1260
aggcgacacc gggggcgcct ccagcgagga cagcctcatg acctccttcc tgcctggccc 1320
caagcccggc gcccctttcc ccaacgggca cgtgggcgcc ggcgggagtg ggctcctgcc 1380
cccccctcct gcgctgtgcg gggccagcgc ctgcgacgtg agcgtgcgcg tggtggtggg 1440
cgagcccacc gaggcccgcg tggtgccggg cagaggcatt tgtctggacc tggccatcct 1500
cgactccgcc ttcctcctca gccaggtggc cccgtccctc ttcatgggct ctatcgtcca 1560
gctgtctcag agcgtcaccg cttacatggt gtccgctgct ggactgggct tggtggctat 1620
ttatttcgcc acccaggtgg tgttcgacaa gagcgacctg gccaaatact ccgcctgact 1680
cgaggcag 1688



<210> 22 
<211> 1688 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Codon optimised human P501S 
<400> 22 



gacggctagc gccaccatgg tgcagcggct gtgggtgtcc cggctgctgc gccatagaaa 60
ggcccagttg ctgctggtga acctgctgac tttcggactg gaggtgtgcc tggctgccgg 120
gatcacgtac gtgccccccc tgctgctgga ggtgggcgtg gaggagaagt tcatgacaat 180
ggtgctgggc atcggccccg tcctgggcct cgtgtgtgtg cccctcctcg ggagtgcgtc 240
cgatcattgg cggggccgct acggccgccg cagaccgttc atctgggccc tgagcctggg 300
catcctgctc tctctcttcc tgatcccccg ggccggctgg ctggccggcc tgctgtgtcc 360
cgacccccgc cctctggagc tggccctcct gatcctgggc gtgggcctgc tggacttctg 420
cggccaggtg tgtttcactc ccctggaggc tctgctctcc gacctcttcc gcgaccccga 480
ccactgtagg caggcttaca gcgtgtacgc cttcatgatc agtctggggg gatgcctggg 540
ctatctgctg cccgctatcg actgggacac cagcgccctg gccccctacc tggggactca 600
ggaggagtgc ctgttcggcc tgctcacctt gatcttcctg acgtgcgtcg ccgccaccct 660
gctggtggcc gaggaggcgg ccctggggcc caccgagccc gccgagggcc tgagcgctcc 720
cagcctgagc ccccattgct gcccgtgcag ggctaggctc gccttcagga atctgggcgc 780
tttgctgccc cgcctgcatc agctgtgctg tcgcatgcct cgcaccctgc gccgcctgtt 840
cgtcgctgag ctctgttcct ggatggccct gatgacgttc accctcttct acaccgactt 900
cgtgggggag ggcctgtacc agggcgtgcc cagggccgag cccggcaccg aggctaggcg 960
ccattacgac gagggcgtca ggatgggctc tctgggcctc ctcctgcagt gcgccatcag 1020
tctggtgttc tctctggtga tggaccggct ggtgcagcgc ttcggcaccc gggccgtgta 1080
cctcgcctct gtggcggctt tccccgtcgc cgccggcgcg acctgcctgt ctcattctgt 1140
cgccgtggtg accgccagcg ccgccctgac cggcttcacc ttcagtgcgc tccagattct 1200
gccctacacc ctggcgtctc tgtaccatcg cgagaagcag gtgttcctgc ccaagtaccg 1260
cggggacaca gggggagctt cctctgagga cagcctgatg accagcttct tgcccggccc 1320
caagccgggg gcccctttcc ccaacggcca tgtcggggcg ggcggcagcg gcctgctccc 1380
tccccccccc gccctgtgcg gcgctagtgc ctgcgacgtg agcgtgcggg tggtggtggg 1440
ggagcccacc gaggctaggg tcgtgcctgg ccgggggatc tgcctggacc tggccatcct 1500
cgactccgcc ttcctgctct cccaggtggc gcccagcctg ttcatgggca gtatcgtgca 1560
gctgagccag agcgtgaccg cctacatggt gagcgccgcc ggcctggggt tggtggccat 1620
ctactttgcc acccaggtcg tgttcgacaa gagcgatctc gccaagtata gcgcctgact 1680
cgaggcag 1688



<210> 23 
<211> 435 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Hybrid gene between St. pneum. C-LytA, P2 T helper epitope and a small portion of the 5' end of human P501S
<400> 23 



atggcggccg cttacgtaca ttccgacggc tcttatccaa aagacaagtt tgagaaaatc 60
aatggcactt ggtactactt tgacagttca ggctatatgc ttgcagaccg ctggaggaag 120
cacacagacg gcaactggta ctggttcgac aactcaggcg aaatggctac aggctggaag 180
aaaatcgctg ataagtggta ctatttcaac gaagaaggtg ccatgaagac aggctgggtc 240
aagtacaagg acacttggta ctacttagac gctaaagaag gcgccatgca atacatcaag 300
gctaactcta agttcattgg tatcactgaa ggcgtcatgg tatcaaatgc ctttatccag 360
tcagcggacg gaacaggctg gtactacctc aaaccagacg gaacactggc agacaggcca 420
gaaaagttca tgtac 435



<210> 24 
<211>435 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Hybrid gene between St. pneum. C-LytA, P2 T helper epitope and a small portion of the 5' end of human 
P501S - codon-optimised 

<400> 24 
atggccgccg cctacgtgca tagcgacggg agctacccca aggacaagtt cgagaagatc 60
aacgggacat ggtactactt cgactcctcc ggctacatgc tcgccgaccg ctggcggaag 120
cacaccgacg gcaactggta ctggttcgat aactcgggag agatggccac cggctggaag 180
aagatcgcgg acaagtggta ctatttcaac gaggagggcg ccatgaagac cggctgggtg 240
aagtataagg acacctggta ctacctcgac gccaaggagg gcgccatgca gtatatcaag 300
gccaacagca agttcatcgg catcaccgag ggagtgatgg tcagcaacgc ctttatccag 360
agcgccgacg gcaccggatg gtactacttg aagccggacg gcaccctcgc ggatcggccc 420
gagaagttca tgtac 435



<210> 25 
<211>435 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Hybrid gene between St. pneum. C-LytA, P2 T helper epitope and a small portion of the 5' end of human 
P501S - codon-optimised 

<400> 25 



atggccgccg cctacgtgca cagcgacggg tcctacccaa aggacaagtt cgagaagatc 60
aacggcacgt ggtactattt cgacagcagc ggctacatgc tcgccgatcg ctggcgcaag 120
cacaccgacg ggaactggta ctggttcgac aactctggcg agatggctac ggggtggaag 180
aagatcgccg acaagtggta ctacttcaac gaggagggcg ccatgaagac cgggtgggtg 240
aagtacaagg acacctggta ctacctggac gctaaggagg gcgccatgca gtacatcaag 300
gccaactcga agttcatcgg gatcaccgag ggcgtgatgg tcagtaacgc tttcatccag 360
agcgcggacg gcacaggctg gtattacctg aagcccgatg gcaccctggc ggacagacct 420
gagaaattca tgtac 435



<210> 26 
<211>464 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Hybrid gene between St. pneum. C-LytA, P2 T helper epitope and a small portion of the 5' end of human 
P501S - codon-optimised 

<400> 26 



gacggctagc gccaccatgg ccgccgccta cgtgcatagc gacgggagct accccaagga 60
caagttcgag aagatcaacg ggacatggta ctacttcgac tcctccggct acatgctcgc 120
cgaccgctgg cggaagcaca ccgacggcaa ctggtactgg ttcgataact cgggagagat 180
ggccaccggc tggaagaaga tcgcggacaa gtggtactat ttcaacgagg agggcgccat 240
gaagaccggc tgggtgaagt ataaggacac ctggtactac ctcgacgcca aggagggcgc 300
catgcagtat atcaaggcca acagcaagtt catcggcatc accgagggag tgatggtcag 360
caacgccttt atccagagcg ccgacggcac cggatggtac tacttgaagc cggacggcac 420
cctcgcggat cggcccgaga agttcatgta ctgactcgag gcag 464



<210> 27 
<211>652 
<212> PRT 

<213> Artificial Sequence 
<220> 
<223> Hybrid protein between St. pneum. C-LytA, P2 T helper epitope and amino acids 51-553 of human P501S
<400> 27 



Met Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys
1               5                   10                  15

Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr
            20                  25                  30

Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
        35                  40                  45

Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
    50                  55                  60

Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
65                  70                  75                  80

Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met
                85                  90                  95
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Gin 


Tyr 


lie 


Lys 


Ala 


Aen 


Ser 


Lys 


Phe 


He 


Gly 


He 


Thr 


Glu 


Gly 


Val 








100 










105 










110 






Met 


Val 


Ser 


Asn 


Ala 


Phe 


He 


Gin 


Ser 


Ala 


Asp 


Gly 


Thr 


Gly 


Trp 


Tyr 






115 










120 










125 








Tyr 


Leu 


Lys 


Pro 


Asp 


Gly 


Thr 


Leu 


Ala 


Asp 


Arg 


Pro 


Glu 


Lys 


Phe 


Met 




130 










135 










140 










Tyr 


Met 


Val 


Leu 


Gly 


lie 


Gly 


Pro 


Val 


Leu 


Gly 


Leu 


Val 


Cys 


Val 


Pro 


145 










150 










155 










160 


Leu 


Leu 


Gly 


Ser 


Ala 


Ser 


Asp 


His 


Trp 


Arg 


Gly 


Arg 


Tyr 


Gly 


Arg 


Arg 










165 










170 










175 




Arg 


Pro 


Phe 


lie 


Trp 


Ala 


Leu 


Ser 


Leu 


Gly 


He 


Leu 


Leu 


Ser 


Leu 


Phe 








180 










185 










190 






Leu 


lie 


Pro 


Arg 


Ala 


Gly 


Trp 


Leu 


Ala 


Gly 


Leu 


Leu 


Cys 


Pro 


Asp 


Pro 






195 










200 










205 








Arg 


Pro 


Leu 


Glu 


Leu 


Ala 


Leu 


Leu 


He 


Leu 


Gly 


Val 


Gly 


Leu 


Leu 


Asp 




210 










215 










220 










Phe 


Cys 


Gly 


Gin 


Val 


cys 


Phe 


Thr 


Pro 


Leu 


Glu 


Ala 


Leu 


Leu 


Ser 


Asp 


225 










230 










235 










240 


Leu 


Phe 


Arg 


Asp 


Pro 


Asp 


His 


Cys 


Arg 


Gin 


Ala 


Tyr 


Ser 


Val 


Tyr 


Ala 










245 










250 










255 




Phe 


Met 


lie 


Ser 


Leu 


Gly 


Gly 


Cye 


Leu 


Gly 


Tyr 


Leu 


Leu 


Pro 


Ala 


He 








260 










265 










270 






Asp 


Trp 


Asp 


Thr 


Ser 


Ala 


Leu 


Ala 


Pro 


Tyr 


Leu 


Gly 


Thr 


Gin 


Glu 


Glu 






275 










280 










285 








Cys 


Leu 


Phe 


Gly 


Leu 


Leu 


Thr 


Leu 


He 


Phe 


Leu 


Thr 


Cys 


Val 


Ala 


Ala 




290 










295 










300 










Thr 


Leu 


Leu 


Val 


Ala 


Glu 


Glu 


Ala 


Ala 


Leu 


Gly 


Pro 


Thr 


Glu 


Pro 


Ala 


305 










310 










315 










320 


Glu 


Gly 


Leu 


Ser 


Ala 


Pro 


Ser 


Leu 


Ser 


Pro 


His 


Cys 


Cys 


Pro 


Cys 


Arg 










325 










330 










335 




Ala 


Ar 3 


Leu 


Ala 


Phe 


Arg 


Asn 


Leu 


Gly 


Ala 


Leu 


Leu 


Pro 


Arg 


Leu 


His 








340 










345 










350 






Gin 


Leu 


Cys 


Cys 


Arg 


Met 


Pro 


Arg 


Thr 


Leu 


Arg 


Arg 


Leu 


Phe 


Val 


Ala 






355 










360 










365 








Glu 


Leu 


Cys 


Ser 


Trp 


Met 


Ala 


Leu 


Met 


Thr 


Phe 


Thr 


Leu 


Phe 


Tyr 


Thr 




370 










375 










380 










Asp 


Phe 


Val 


Gly 


Glu 


Gly 


Leu 


Tyr 


Gin 


Gly 


Val 


Pro 


Axg 


Ala 


Glu 


Pro 


385 










390 










395 










400 


Gly 


Thr 


Glu 


Ala 


Arg 


Arg 


His 


Tyr 


Asp 


Glu 


Gly 


Val 


Arg 


Met 


Gly 


Ser 










405 










410 










415 




Leu 


Gly 


Leu 


Phe 


Leu 


Gin 


Cys 


Ala 


He 


Ser 


Leu 


Val 


Phe 


Ser 


Leu 


Val 








420 










425 










430 






Met 


Asp 


Arg 


Leu 


Val 


Gin 


Arg 


Phe 


Gly 


Thr 


Arg 


Ala 


Val 


Tyr 


Leu 


Ala 






435 










440 










445 








Ser 


Val 


Ala 


Ala 


Phe 


Pro 


Val 


Ala 


Ala 


Gly 


Ala 


Thr 


Cys 


Leu 


Ser 


His 




450 










455 










460 










Ser 


Val 


Ala 


Val 


Val 


Thr 


Ala 


Ser 


Ala 


Ala 


Leu 


Thr 


Gly 


Phe 


Thr 


Phe 


465 










470 










475 










480 


Ser 


Ala 


Leu 


Gin 


lie 


Leu 


Pro 


Tyr 


Thr 


Leu 


Ala 


Ser 


Leu 


Tyr 


His 


Arg 










485 










490 










495 




Glu 


Lys 


Gin 


Val 


Phe 


Leu 


Pro 


Lys 


Tyr 


Arg 


Gly 


Asp 


Thr 


Gly 


Gly 


Ala 








500 










505 










510 






Ser 


Ser 


Glu 


Asp 


Ser 


Leu 


Met 


Thr 


Ser 


Phe 


Leu 


Pro 


Gly 


Pro 


Lys 


Pro 


































Gly 


Ala 


Pro 


Phe 


Pro 


Asn 


Gly 


His 


Val 


Gly 


Ala 


Gly 


Gly 


Ser 


Gly 


Leu 




530 










535 










540 










Leu 


Pro 


Pro 


pro 


Pro 


Ala 


Leu 


Cys 


Gly 


Ala 


Ser 


Ala 




Asp 


Val 


Ser 


545 










550 










555 










560 


Val 


Arg 


val 


val 


Val 


Gly 


Glu 


Pro 


Thr 


Glu 


Ala 


Arg 


Val 


val 


Pro 


Gly 










565 








570 










575 




Arg 


Gly 


lie 


Cys 


Leu 


Asp 


Leu 


Ala 


lie 


Leu 


Asp 


Ser 


Ala 


Phe 


Leu 


Leu 



49 



EP1 511 768 B1 



580 



Ser Gin Val Ala 


Pro 


Ser 


l>eu Phe 


595 






600 


Gin Ser val Thr 


Ala 


Tyr 


Met Val 


610 






615 


Ala lie Tyr Phe 


Ala 


Thr 


Gin Val 


625 




630 




Lys Tyr Ser Ala 


Gly Gly 


His His 




645 







585 


590 


Met Gly 


Ser He Val Gin Leu Ser 


605 


Ser Ala 


Ala Gly Leu Gly Leu Val 




620 


Val Phe 


Asp Lys Ser Asp Leu Ala 




635 640 


His His 


His His 


650 





<210> 28 
<211> 1959 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA encoding the Hybrid protein between St. pneum. C-LytA, P2 T helper epitope and amino acids 51-553 
of human P501S 

<400> 28 



atggcggccg cttacgtaca ttccgacggc tcttatccaa aagacaagtt tgagaaaatc 60
aatggcactt ggtactactt tgacagttca ggctatatgc ttgcagaccg ctggaggaag 120
cacacagacg gcaactggta ctggttcgac aactcaggcg aaatggctac aggctggaag 180
aaaatcgctg ataagtggta ctatttcaac gaagaaggtg ccatgaagac aggctgggtc 240
aagtacaagg acacttggta ctacttagac gctaaagaag gcgccatgca atacatcaag 300
gctaactcta agttcattgg tatcactgaa ggcgtcatgg tatcaaatgc ctttatccag 360
tcagcggacg gaacaggctg gtactacctc aaaccagacg gaacactggc agacaggcca 420
gaaaagttca tgtacatggt gctgggcatt ggtccagtgc tgggcctggt ctgtgtcccg 480
ctcctaggct cagccagtga ccactggcgt ggacgctatg gccgccgccg gcccttcatc 540
tgggcactgt ccttgggcat cctgctgagc ctctttctca tcccaagggc cggctggcta 600
gcagggctgc tgtgcccgga tcccaggccc ctggagctgg cactgctcat cctgggcgtg 660
gggctgctgg acttctgtgg ccaggtgtgc ttcactccac tggaggccct gctctctgac 720
ctcttccggg acccggacca ctgtcgccag gcctactctg tctatgcctt catgatcagt 780
cttgggggct gcctgggcta cctcctgcct gccattgact gggacaccag tgccctggcc 840
ccctacctgg gcacccagga ggagtgcctc tttggcctgc tcaccctcat cttcctcacc 900
tgcgtagcag ccacactgct ggtggctgag gaggcagcgc tgggccccac cgagccagca 960
gaagggctgt cggccccctc cttgtcgccc cactgctgtc catgccgggc ccgcttggct 1020
ttccggaacc tgggcgccct gcttccccgg ctgcaccagc tgtgctgccg catgccccgc 1080
accctgcgcc ggctcttcgt ggctgagctg tgcagctgga tggcactcat gaccttcacg 1140
ctgttttaca cggatttcgt gggcgagggg ctgtaccagg gcgtgcccag agctgagccg 1200
ggcaccgagg cccggagaca ctatgatgaa ggcgttcgga tgggcagcct ggggctgttc 1260
ctgcagtgcg ccatctccct ggtcttctct ctggtcatgg accggctggt gcagcgattc 1320
ggcactcgag cagtctattt ggccagtgtg gcagctttcc ctgtggctgc cggtgccaca 1380
tgcctgtccc acagtgtggc cgtggtgaca gcttcagccg ccctcaccgg gttcaccttc 1440
tcagccctgc agatcctgcc ctacacactg gcctccctct accaccggga gaagcaggtg 1500
ttcctgccca aataccgagg ggacactgga ggtgctagca gtgaggacag cctgatgacc 1560
agcttcctgc caggccctaa gcctggagct cccttcccta atggacacgt gggtgctgga 1620
ggcagtggcc tgctcccacc tccacccgcg ctctgcgggg cctctgcctg tgatgtctcc 1680
gtacgtgtgg tggtgggtga gcccaccgag gccagggtgg ttccgggccg gggcatctgc 1740
ctggacctcg ccatcctgga tagtgccttc ctgctgtccc aggtggcccc atccctgttt 1800
atgggctcca ttgtccagct cagccagtct gtcactgcct atatggtgtc tgccgcaggc 1860
ctgggtctgg tcgccattta ctttgctaca caggtagtat ttgacaagag cgacttggcc 1920
aaatactcag cgggtggaca ccatcaccat caccattaa 1959



<210> 29
<211> 507
<212> PRT

<213> Artificial Sequence
<220>

<223> Human P501S (amino acids 55-553) fused to 6 histidine residues
<400> 29

Met Val Leu Gly Ile Gly Pro Val Leu Gly Leu Val Cys Val Pro Leu
1 5 10 15

Leu Gly Ser Ala Ser Asp His Trp Arg Gly Arg Tyr Gly Arg Arg Arg
20 25 30

Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile Leu Leu Ser Leu Phe Leu
35 40 45

Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu Leu Cys Pro Asp Pro Arg
50 55 60

Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly Val Gly Leu Leu Asp Phe
65 70 75 80

Cys Gly Gln Val Cys Phe Thr Pro Leu Glu Ala Leu Leu Ser Asp Leu
85 90 95

Phe Arg Asp Pro Asp His Cys Arg Gln Ala Tyr Ser Val Tyr Ala Phe
100 105 110

Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr Leu Leu Pro Ala Ile Asp
115 120 125

Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu Gly Thr Gln Glu Glu Cys
130 135 140

Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu Thr Cys Val Ala Ala Thr
145 150 155 160

Leu Leu Val Ala Glu Glu Ala Ala Leu Gly Pro Thr Glu Pro Ala Glu
165 170 175

Gly Leu Ser Ala Pro Ser Leu Ser Pro His Cys Cys Pro Cys Arg Ala
180 185 190

Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu Leu Pro Arg Leu His Gln
195 200 205

Leu Cys Cys Arg Met Pro Arg Thr Leu Arg Arg Leu Phe Val Ala Glu
210 215 220

Leu Cys Ser Trp Met Ala Leu Met Thr Phe Thr Leu Phe Tyr Thr Asp
225 230 235 240

Phe Val Gly Glu Gly Leu Tyr Gln Gly Val Pro Arg Ala Glu Pro Gly
245 250 255

Thr Glu Ala Arg Arg His Tyr Asp Glu Gly Val Arg Met Gly Ser Leu
260 265 270

Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu Val Phe Ser Leu Val Met
275 280 285

Asp Arg Leu Val Gln Arg Phe Gly Thr Arg Ala Val Tyr Leu Ala Ser
290 295 300

Val Ala Ala Phe Pro Val Ala Ala Gly Ala Thr Cys Leu Ser His Ser
305 310 315 320

Val Ala Val Val Thr Ala Ser Ala Ala Leu Thr Gly Phe Thr Phe Ser
325 330 335

Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala Ser Leu Tyr His Arg Glu
340 345 350

Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly Asp Thr Gly Gly Ala Ser
355 360 365

Ser Glu Asp Ser Leu Met Thr Ser Phe Leu Pro Gly Pro Lys Pro Gly
370 375 380

Ala Pro Phe Pro Asn Gly His Val Gly Ala Gly Gly Ser Gly Leu Leu
385 390 395 400

Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser Ala Cys Asp Val Ser Val
405 410 415

Arg Val Val Val Gly Glu Pro Thr Glu Ala Arg Val Val Pro Gly Arg
420 425 430

Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp Ser Ala Phe Leu Leu Ser
435 440 445

Gln Val Ala Pro Ser Leu Phe Met Gly Ser Ile Val Gln Leu Ser Gln
450 455 460

Ser Val Thr Ala Tyr Met Val Ser Ala Ala Gly Leu Gly Leu Val Ala
465 470 475 480

Ile Tyr Phe Ala Thr Gln Val Val Phe Asp Lys Ser Asp Leu Ala Lys
485 490 495

Tyr Ser Ala Gly Gly His His His His His His
500 505





<210> 30 
<211> 1524 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA encoding Human P501S (amino acids 55-553) fused to 6 histidine residues 
<400> 30 



atggtgctgg gcattggtcc agtgctgggc ctggtctgtg tcccgctcct aggctcagcc 60 
agtgaccact ggcgtggacg ctatggccgc cgccggccct tcatctgggc actgtccttg 120 
ggcatcctgc tgagcctctt tctcatccca agggccggct ggctagcagg gctgctgtgc 180 
ccggatccca ggcccctgga gctggcactg ctcatcctgg gcgtggggct gctggacttc 240 
tgtggccagg tgtgcttcac tccactggag gccctgctct ctgacctctt ccgggacccg 300 
gaccactgtc gccaggccta ctctgtctat gccttcatga tcagtcttgg gggctgcctg 360 
ggctacctcc tgcctgccat tgactgggac accagtgccc tggcccccta cctgggcacc 420 
caggaggagt gcctctttgg cctgctcacc ctcatcttcc tcacctgcgt agcagccaca 480 
ctgctggtgg ctgaggaggc agcgctgggc cccaccgagc cagcagaagg gctgtcggcc 540
ccctccttgt cgccccactg ctgtccatgc cgggcccgct tggctttccg gaacctgggc 600
gccctgcttc cccggctgca ccagctgtgc tgccgcatgc cccgcaccct gcgccggctc 660
ttcgtggctg agctgtgcag ctggatggca ctcatgacct tcacgctgtt ttacacggat 720 
ttcgtgggcg aggggctgta ccagggcgtg cccagagctg agccgggcac cgaggcccgg 780 
agacactatg atgaaggcgt tcggatgggc agcctggggc tgttcctgca gtgcgccatc 840 
tccctggtct tctctctggt catggaccgg ctggtgcagc gattcggcac tcgagcagtc 900 
tatttggcca gtgtggcagc tttccctgtg gctgccggtg ccacatgcct gtcccacagt 960 
gtggccgtgg tgacagcttc agccgccctc accgggttca ccttctcagc cctgcagatc 1020 
ctgccctaca cactggcctc cctctaccac cgggagaagc aggtgttcct gcccaaatac 1080 
cgaggggaca ctggaggtgc tagcagtgag gacagcctga tgaccagctt cctgccaggc 1140 
cctaagcctg gagctccctt ccctaatgga cacgtgggtg ctggaggcag tggcctgctc 1200 
ccacctccac ccgcgctctg cggggcctct gcctgtgatg tctccgtacg tgtggtggtg 1260 
ggtgagccca ccgaggccag ggtggttccg ggccggggca tctgcctgga cctcgccatc 1320 
ctggatagtg ccttcctgct gtcccaggtg gccccatccc tgtttatggg ctccattgtc 1380 
cagctcagcc agtctgtcac tgcctatatg gtgtctgccg caggcctggg tctggtcgcc 1440 
atttactttg ctacacaggt agtatttgac aagagcgact tggccaaata ctcagcgggt 1500 
ggacaccatc accatcacca ttaa 1524 



<210> 31
<211> 685
<212> PRT

<213> Artificial Sequence
<220>

<223> Human P501S (amino acids 1-34 fused to 55-553) fused to 6 histidine residues

<400> 31

Met Ala Ala Val Gln Arg Leu Trp Val Ser Arg Leu Leu Arg His Arg
1 5 10 15

Lys Ala Gln Leu Leu Leu Val Asn Leu Leu Thr Phe Gly Leu Glu Val
20 25 30

Cys Leu Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro Lys Asp
35 40 45

Lys Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly
50 55 60

Tyr Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr
65 70 75 80

Trp Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala
85 90 95

Asp Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp
100 105 110

Val Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala
115 120 125

Met Gln Tyr Ile Lys Ala Asn Ser Lys Phe Ile Gly Ile Thr Glu Gly
130 135 140

Val Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp
145 150 155 160

Tyr Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Lys Phe
165 170 175

Met Tyr Met Val Leu Gly Ile Gly Pro Val Leu Gly Leu Val Cys Val
180 185 190

Pro Leu Leu Gly Ser Ala Ser Asp His Trp Arg Gly Arg Tyr Gly Arg
195 200 205

Arg Arg Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile Leu Leu Ser Leu
210 215 220

Phe Leu Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu Leu Cys Pro Asp
225 230 235 240

Pro Arg Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly Val Gly Leu Leu
245 250 255

Asp Phe Cys Gly Gln Val Cys Phe Thr Pro Leu Glu Ala Leu Leu Ser
260 265 270

Asp Leu Phe Arg Asp Pro Asp His Cys Arg Gln Ala Tyr Ser Val Tyr
275 280 285

Ala Phe Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr Leu Leu Pro Ala
290 295 300

Ile Asp Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu Gly Thr Gln Glu
305 310 315 320

Glu Cys Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu Thr Cys Val Ala
325 330 335

Ala Thr Leu Leu Val Ala Glu Glu Ala Ala Leu Gly Pro Thr Glu Pro
340 345 350

Ala Glu Gly Leu Ser Ala Pro Ser Leu Ser Pro His Cys Cys Pro Cys
355 360 365

Arg Ala Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu Leu Pro Arg Leu
370 375 380

His Gln Leu Cys Cys Arg Met Pro Arg Thr Leu Arg Arg Leu Phe Val
385 390 395 400

Ala Glu Leu Cys Ser Trp Met Ala Leu Met Thr Phe Thr Leu Phe Tyr
405 410 415

Thr Asp Phe Val Gly Glu Gly Leu Tyr Gln Gly Val Pro Arg Ala Glu
420 425 430

Pro Gly Thr Glu Ala Arg Arg His Tyr Asp Glu Gly Val Arg Met Gly
435 440 445

Ser Leu Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu Val Phe Ser Leu
450 455 460

Val Met Asp Arg Leu Val Gln Arg Phe Gly Thr Arg Ala Val Tyr Leu
465 470 475 480

Ala Ser Val Ala Ala Phe Pro Val Ala Ala Gly Ala Thr Cys Leu Ser
485 490 495

His Ser Val Ala Val Val Thr Ala Ser Ala Ala Leu Thr Gly Phe Thr
500 505 510

Phe Ser Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala Ser Leu Tyr His
515 520 525

Arg Glu Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly Asp Thr Gly Gly
530 535 540

Ala Ser Ser Glu Asp Ser Leu Met Thr Ser Phe Leu Pro Gly Pro Lys
545 550 555 560

Pro Gly Ala Pro Phe Pro Asn Gly His Val Gly Ala Gly Gly Ser Gly
565 570 575

Leu Leu Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser Ala Cys Asp Val
580 585 590

Ser Val Arg Val Val Val Gly Glu Pro Thr Glu Ala Arg Val Val Pro
595 600 605

Gly Arg Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp Ser Ala Phe Leu
610 615 620

Leu Ser Gln Val Ala Pro Ser Leu Phe Met Gly Ser Ile Val Gln Leu
625 630 635 640

Ser Gln Ser Val Thr Ala Tyr Met Val Ser Ala Ala Gly Leu Gly Leu
645 650 655

Val Ala Ile Tyr Phe Ala Thr Gln Val Val Phe Asp Lys Ser Asp Leu
660 665 670

Ala Lys Tyr Ser Ala Gly Gly His His His His His His
675 680 685









<210> 32
<211> 2058
<212> DNA

<213> Artificial Sequence
<220>

<223> DNA encoding Human P501S (amino acids 1-34 fused to 55-553) fused to 6 histidine residues 
<400> 32

atggcggccg tgcagaggct atgggtatcg agactgctaa gacaccgcaa agctcagttg 60
ttgttggtta acttgttgac cttcgggctg gaagtctgtt tggcggccgc ttacgtacat 120
tccgacggct cttatccaaa agacaagttc gagaaaatca atggcacttg gtactacttt 180
gacagttcag gctatatgct tgcagaccgc tggaggaagc acacagacgg caactggtac 240
tggttcgaca actcaggcga aatggctaca ggctggaaga aaatcgctga taagtggtac 300
tatttcaacg aagaaggtgc catgaagaca ggctgggtca agtacaagga cacttggtac 360
tacttagacg ctaaagaagg cgccatgcaa tacatcaagg ctaactctaa gttcattggt 420
atcactgaag gcgtcatggt atcaaatgcc tttatccagt cagcggacgg aacaggctgg 480
tactacctca aaccagacgg aacactggca gacaggccag aaaagttcat gtacatggtg 540
ctgggcattg gtccagtgct gggcctggtc tgtgtcccgc tcctaggctc agccagtgac 600
cactggcgtg gacgctatgg ccgccgccgg cccttcatct gggcactgtc cttgggcatc 660
ctgctgagcc tctttctcat cccaagggcc ggctggctag cagggctgct gtgcccggat 720
cccaggcccc tggagctggc actgctcatc ctgggcgtgg ggctgctgga cttctgtggc 780
caggtgtgct tcactccact ggaggccctg ctctctgacc tcttccggga cccggaccac 840
tgtcgccagg cctactctgt ctatgccttc atgatcagtc ttgggggctg cctgggctac 900
ctcctgcctg ccattgactg ggacaccagt gccctggccc cctacctggg cacccaggag 960
gagtgcctct ttggcctgct caccctcatc ttcctcacct gcgtagcagc cacactgctg 1020
gtggctgagg aggcagcgct gggccccacc gagccagcag aagggctgtc ggccccctcc 1080
ttgtcgcccc actgctgtcc atgccgggcc cgcttggctt tccggaacct gggcgccctg 1140
cttccccggc tgcaccagct gtgctgccgc atgccccgca ccctgcgccg gctcttcgtg 1200
gctgagctgt gcagctggat ggcactcatg accttcacgc tgttttacac ggatttcgtg 1260
ggcgaggggc tgtaccaggg cgtgcccaga gctgagccgg gcaccgaggc ccggagacac 1320
tatgatgaag gcgttcggat gggcagcctg gggctgttcc tgcagtgcgc catctccctg 1380
gtcttctctc tggtcatgga ccggctggtg cagcgattcg gcactcgagc agtctatttg 1440
gccagtgtgg cagctttccc tgtggctgcc ggtgccacat gcctgtccca cagtgtggcc 1500
gtggtgacag cttcagccgc cctcaccggg ttcaccttct cagccctgca gatcctgccc 1560
tacacactgg cctccctcta ccaccgggag aagcaggtgt tcctgcccaa ataccgaggg 1620
gacactggag gtgctagcag tgaggacagc ctgatgacca gcttcctgcc aggccctaag 1680
cctggagctc ccttccctaa tggacacgtg ggtgctggag gcagtggcct gctcccacct 1740
ccacccgcgc tctgcggggc ctctgcctgt gatgtctccg tacgtgtggt ggtgggtgag 1800
cccaccgagg ccagggtggt tccgggccgg ggcatctgcc tggacctcgc catcctggat 1860
agtgccttcc tgctgtccca ggtggcccca tccctgttta tgggctccat tgtccagctc 1920
agccagtctg tcactgccta tatggtgtct gccgcaggcc tgggtctggt cgccatttac 1980
tttgctacac aggtagtatt tgacaagagc gacttggcca aatactcagc gggtggacac 2040
catcaccatc accattaa 2058



<210> 33 
<211> 671 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> St. pneum. C-LytA portion fused to P2 T helper epitope fused to Human P501S (amino acids 55-553) fused 
to 6 histidine residues downstream of yeast alphaprepro signal sequence 

<400> 33

Met Ala Ala Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala
1 5 10 15

Ser Ser Ala Leu Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro
20 25 30

Lys Asp Lys Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser
35 40 45

Ser Gly Tyr Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn
50 55 60

Trp Tyr Trp Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys
65 70 75 80

Ile Ala Asp Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr
85 90 95

Gly Trp Val Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu
100 105 110

Gly Ala Met Gln Tyr Ile Lys Ala Asn Ser Lys Phe Ile Gly Ile Thr
115 120 125

Glu Gly Val Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr
130 135 140

Gly Trp Tyr Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu
145 150 155 160

Lys Phe Met Tyr Met Val Leu Gly Ile Gly Pro Val Leu Gly Leu Val
165 170 175

Cys Val Pro Leu Leu Gly Ser Ala Ser Asp His Trp Arg Gly Arg Tyr
180 185 190

Gly Arg Arg Arg Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile Leu Leu
195 200 205

Ser Leu Phe Leu Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu Leu Cys
210 215 220

Pro Asp Pro Arg Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly Val Gly
225 230 235 240

Leu Leu Asp Phe Cys Gly Gln Val Cys Phe Thr Pro Leu Glu Ala Leu
245 250 255

Leu Ser Asp Leu Phe Arg Asp Pro Asp His Cys Arg Gln Ala Tyr Ser
260 265 270

Val Tyr Ala Phe Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr Leu Leu
275 280 285

Pro Ala Ile Asp Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu Gly Thr
290 295 300

Gln Glu Glu Cys Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu Thr Cys
305 310 315 320

Val Ala Ala Thr Leu Leu Val Ala Glu Glu Ala Ala Leu Gly Pro Thr
325 330 335

Glu Pro Ala Glu Gly Leu Ser Ala Pro Ser Leu Ser Pro His Cys Cys
340 345 350

Pro Cys Arg Ala Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu Leu Pro
355 360 365

Arg Leu His Gln Leu Cys Cys Arg Met Pro Arg Thr Leu Arg Arg Leu
370 375 380

Phe Val Ala Glu Leu Cys Ser Trp Met Ala Leu Met Thr Phe Thr Leu
385 390 395 400

Phe Tyr Thr Asp Phe Val Gly Glu Gly Leu Tyr Gln Gly Val Pro Arg
405 410 415

Ala Glu Pro Gly Thr Glu Ala Arg Arg His Tyr Asp Glu Gly Val Arg
420 425 430

Met Gly Ser Leu Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu Val Phe
435 440 445

Ser Leu Val Met Asp Arg Leu Val Gln Arg Phe Gly Thr Arg Ala Val
450 455 460

Tyr Leu Ala Ser Val Ala Ala Phe Pro Val Ala Ala Gly Ala Thr Cys
465 470 475 480

Leu Ser His Ser Val Ala Val Val Thr Ala Ser Ala Ala Leu Thr Gly
485 490 495

Phe Thr Phe Ser Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala Ser Leu
500 505 510

Tyr His Arg Glu Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly Asp Thr
515 520 525

Gly Gly Ala Ser Ser Glu Asp Ser Leu Met Thr Ser Phe Leu Pro Gly
530 535 540

Pro Lys Pro Gly Ala Pro Phe Pro Asn Gly His Val Gly Ala Gly Gly
545 550 555 560

Ser Gly Leu Leu Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser Ala Cys
565 570 575

Asp Val Ser Val Arg Val Val Val Gly Glu Pro Thr Glu Ala Arg Val
580 585 590

Val Pro Gly Arg Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp Ser Ala
595 600 605

Phe Leu Leu Ser Gln Val Ala Pro Ser Leu Phe Met Gly Ser Ile Val
610 615 620

Gln Leu Ser Gln Ser Val Thr Ala Tyr Met Val Ser Ala Ala Gly Leu
625 630 635 640

Gly Leu Val Ala Ile Tyr Phe Ala Thr Gln Val Val Phe Asp Lys Ser
645 650 655

Asp Leu Ala Lys Tyr Ser Ala Gly Gly His His His His His His
660 665 670







<210> 34 
<211>2477 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA encoding St. pneum. C-LytA portion fused to P2 T helper epitope fused to Human P501S (amino acids 
55-553) fused to 6 histidine residues downstream of yeast alphaprepro signal sequence



<400> 34

tacgtacatt ccgacggctc ttatccaaaa gacaagtttg agaaaatcaa tggcacttgg 60
tactactttg acagttcagg ctatatgctt gcagaccgct ggaggaagca cacagacggc 120
aactggtact ggttcgacaa ctcaggcgaa atggctacag gctggaagaa aatcgctgat 180
aagtggtact atttcaacga agaaggtgcc atgaagacag gctgggtcaa gtacaaggac 240
acttggtact acttagacgc taaagaaggc gccatgcaat acatcaaggc taactctaag 300
ttcattggta tcactgaagg cgtcatggta tcaaatgcct ttatccagtc agcggacgga 360
acaggctggt actacctcaa accagacgga acactggcag acaggccaga aatggcggcc 420
agatttcctt caatttttac tgcagtttta ttcgcagcat cctccgcatt agcggccgct 480
tacgtacatt ccgacggctc ttatccaaaa gacaagtttg agaaaatcaa tggcacttgg 540
tactactttg acagttcagg ctatatgctt gcagaccgct ggaggaagca cacagacggc 600
aactggtact ggttcgacaa ctcaggcgaa atggctacag gctggaagaa aatcgctgat 660
aagtggtact atttcaacga agaaggtgcc atgaagacag gctgggtcaa gtacaaggac 720
acttggtact acttagacgc taaagaaggc gccatgcaat acatcaaggc taactctaag 780
ttcattggta tcactgaagg cgtcatggta tcaaatgcct ttatccagtc agcggacgga 840
acaggctggt actacctcaa accagacgga acactggcag acaggccaga agctggtatt 900
acttacgttc caccattgtt gttggaagtt ggtgttgaag aaaagttcat gtacatggtg 960
ctgggcattg gtccagtgct gggcctggtc tgtgtcccgc tcctaggctc agccagtgac 1020
cactggcgtg gacgctatgg ccgccgccgg cccttcatct gggcactgtc cttgggcatc 1080
ctgctgagcc tctttctcat cccaagggcc ggctggctag cagggctgct gtgcccggat 1140
cccaggcccc tggagctggc actgctcatc ctgggcgtgg ggctgctgga cttctgtggc 1200
caggtgtgct tcactccact ggaggccctg ctctctgacc tcttccggga cccggaccac 1260
tgtcgccagg cctactctgt ctatgcttca tgatcagtct tgggggctgc ctgggctacc 1320
tcctgcctgc cattgactgg gacaccagtg ccctggcccc ctacctgggc acccaggagg 1380
agtgcctctt tggcctgctc accctcatct tcctcacctg cgtagcagcc acactgctgg 1440
tggctgagga ggcagcgctg ggccccaccg agccagcaga agggctgtcg gccccctcct 1500
tgtcgcccca ctgctgtcca tgccgggccc gcttggcttt ccggaacctg ggcgccctgc 1560
ttccccggct gcaccagctg tgctgccgca tgccccgcac cctgcgccgg ctcttcgtgg 1620
ctgagctgtg cagctggatg gcactcatga ccttcacgct gttttacacg gatttcgtgg 1680
gcgaggggct gtaccagggc gtgcccagag ctgagccggg caccgaggcc cggagacact 1740
atgatgaagg cgttcggatg ggcagcctgg ggctgttcct gcagtgcgcc atctccctgg 1800
tcttctctct ggtcatggac cggctggtgc agcgattcgg cactcgagca gtctatttgg 1860
ccagtgtggc agctttccct gtggctgccg gtgccacatg cctgtcccac agtgtggccg 1920
tggtgacagc ttcagccgcc ctcaccgggt tcaccttctc agccctgcag atcctgccct 1980
acacactggc ctccctctac caccgggaga agcaggtgtt cctgcccaaa taccgagggg 2040
acactggagg tgctagcagt gaggacagcc tgatgaccag cttcctgcca ggccctaagc 2100
ctggagctcc cttccctaat ggacacgtgg gtgctggagg cagtggcctg ctcccacctc 2160
cacccgcgct ctgcggggcc tctgcctgtg atgtctccgt acgtgtggtg gtgggtgagc 2220
ccaccgaggc cagggtggtt ccgggccggg gcatctgcct ggacctcgcc atcctggata 2280
gtgccttcct gctgtcccag gtggccccat ccctgtttat gggctccatt gtccagctca 2340
gccagtctgt cactgcctat atggtgtctg ccgcaggcct gggtctggtc gccatttact 2400
ttgctacaca ggtagtattt gacaagagcg acttggccaa atactcagcg ggtggacacc 2460
atcaccatca ccattaa 2477



<210> 35 
<211> 595
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Human P501S (amino acids 55-553) fused to 6 histidine residues downstream of yeast alphaprepro signal
sequence
<400> 35

Met Ser Phe Leu Asn Phe Thr Ala Val Leu Phe Ala Ala Ser Ser Ala
1               5                   10                  15

Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln Ile
            20                  25                  30

Pro Ala Glu Ala Val Ile Gly Tyr Ser Asp Leu Glu Gly Asp Phe Asp
        35                  40                  45

Val Ala Val Leu Pro Phe Ser Asn Ser Thr Asn Asn Gly Leu Leu Phe
    50                  55                  60

Ile Asn Thr Thr Ile Ala Ser Ile Ala Ala Lys Glu Glu Gly Val Ser
65                  70                  75                  80

Leu Glu Lys Arg Glu Ala Glu Ala Met Val Leu Gly Ile Gly Pro Val
                85                  90                  95

Leu Gly Leu Val Cys Val Pro Leu Leu Gly Ser Ala Ser Asp His Trp
            100                 105                 110

Arg Gly Arg Tyr Gly Arg Arg Arg Pro Phe Ile Trp Ala Leu Ser Leu
        115                 120                 125

Gly Ile Leu Leu Ser Leu Phe Leu Ile Pro Arg Ala Gly Trp Leu Ala
    130                 135                 140

Gly Leu Leu Cys Pro Asp Pro Arg Pro Leu Glu Leu Ala Leu Leu Ile
145                 150                 155                 160

Leu Gly Val Gly Leu Leu Asp Phe Cys Gly Gln Val Cys Phe Thr Pro
                165                 170                 175

Leu Glu Ala Leu Leu Ser Asp Leu Phe Arg Asp Pro Asp His Cys Arg
            180                 185                 190

Gln Ala Tyr Ser Val Tyr Ala Phe Met Ile Ser Leu Gly Gly Cys Leu
        195                 200                 205

Gly Tyr Leu Leu Pro Ala Ile Asp Trp Asp Thr Ser Ala Leu Ala Pro
    210                 215                 220

Tyr Leu Gly Thr Gln Glu Glu Cys Leu Phe Gly Leu Leu Thr Leu Ile
225                 230                 235                 240

Phe Leu Thr Cys Val Ala Ala Thr Leu Leu Val Ala Glu Glu Ala Ala
                245                 250                 255

Leu Gly Pro Thr Glu Pro Ala Glu Gly Leu Ser Ala Pro Ser Leu Ser
            260                 265                 270

Pro His Cys Cys Pro Cys Arg Ala Arg Leu Ala Phe Arg Asn Leu Gly
        275                 280                 285

Ala Leu Leu Pro Arg Leu His Gln Leu Cys Cys Arg Met Pro Arg Thr
    290                 295                 300

Leu Arg Arg Leu Phe Val Ala Glu Leu Cys Ser Trp Met Ala Leu Met
305                 310                 315                 320

Thr Phe Thr Leu Phe Tyr Thr Asp Phe Val Gly Glu Gly Leu Tyr Gln
                325                 330                 335

Gly Val Pro Arg Ala Glu Pro Gly Thr Glu Ala Arg Arg His Tyr Asp
            340                 345                 350

Glu Gly Val Arg Met Gly Ser Leu Gly Leu Phe Leu Gln Cys Ala Ile
        355                 360                 365

Ser Leu Val Phe Ser Leu Val Met Asp Arg Leu Val Gln Arg Phe Gly
    370                 375                 380

Thr Arg Ala Val Tyr Leu Ala Ser Val Ala Ala Phe Pro Val Ala Ala
385                 390                 395                 400

Gly Ala Thr Cys Leu Ser His Ser Val Ala Val Val Thr Ala Ser Ala
                405                 410                 415

Ala Leu Thr Gly Phe Thr Phe Ser Ala Leu Gln Ile Leu Pro Tyr Thr
            420                 425                 430

Leu Ala Ser Leu Tyr His Arg Glu Lys Gln Val Phe Leu Pro Lys Tyr
        435                 440                 445

Arg Gly Asp Thr Gly Gly Ala Ser Ser Glu Asp Ser Leu Met Thr Ser
    450                 455                 460

Phe Leu Pro Gly Pro Lys Pro Gly Ala Pro Phe Pro Asn Gly His Val
465                 470                 475                 480

Gly Ala Gly Gly Ser Gly Leu Leu Pro Pro Pro Pro Ala Leu Cys Gly
                485                 490                 495

Ala Ser Ala Cys Asp Val Ser Val Arg Val Val Val Gly Glu Pro Thr
            500                 505                 510

Glu Ala Arg Val Val Pro Gly Arg Gly Ile Cys Leu Asp Leu Ala Ile
        515                 520                 525

Leu Asp Ser Ala Phe Leu Leu Ser Gln Val Ala Pro Ser Leu Phe Met
    530                 535                 540

Gly Ser Ile Val Gln Leu Ser Gln Ser Val Thr Ala Tyr Met Val Ser
545                 550                 555                 560

Ala Ala Gly Leu Gly Leu Val Ala Ile Tyr Phe Ala Thr Gln Val Val
                565                 570                 575

Phe Asp Lys Ser Asp Leu Ala Lys Tyr Ser Ala Gly Gly His His His
            580                 585                 590

His His His
        595



<210> 36 
<211> 1788 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA encoding Human P501S (amino acids 55-553) fused to 6 histidine residues downstream of yeast 
alphaprepro signal sequence 

<400> 36 



atgagtttcc tcaattttac tgcagtttta ttcgcagcat cctccgcatt agctgctcca 60 
gtcaacacta caacagaaga tgaaacggca caaattccgg ctgaagctgt catcggttac 120 
tcagatttag aaggggattt cgatgttgct gttttgccat tttccaacag cacaaataac 180 
gggttattgt ttataaatac tactattgcc agcattgctg ctaaagaaga aggggtatct 240 
ctcgagaaaa gagaggctga agccatggtg ctgggcattg gtccagtgct gggcctggtc 300 
tgtgtcccgc tcctaggctc agccagtgac cactggcgtg gacgctatgg ccgccgccgg 360 
cccttcatct gggcactgtc cttgggcatc ctgctgagcc tctttctcat cccaagggcc 420 
ggctggctag cagggctgct gtgcccggat cccaggcccc tggagctggc actgctcatc 480 
ctgggcgtgg ggctgctgga cttctgtggc caggtgtgct tcactccact ggaggccctg 540 
ctctctgacc tcttccggga cccggaccac tgtcgccagg cctactctgt ctatgccttc 600 
atgatcagtc ttgggggctg cctgggctac ctcctgcctg ccattgactg ggacaccagt 660 
gccctggccc cctacctggg cacccaggag gagtgcctct ttggcctgct caccctcatc 720 
ttcctcacct gcgtagcagc cacactgctg gtggctgagg aggcagcgct gggccccacc 780 
gagccagcag aagggctgtc ggccccctcc ttgtcgcccc actgctgtcc atgccgggcc 840 
cgcttggctt tccggaacct gggcgccctg cttccccggc tgcaccagct gtgctgccgc 900 
atgccccgca ccctgcgccg gctcttcgtg gctgagctgt gcagctggat ggcactcatg 960 
accttcacgc tgttttacac ggatttcgtg ggcgaggggc tgtaccaggg cgtgcccaga 1020 
gctgagccgg gcaccgaggc ccggagacac tatgatgaag gcgttcggat gggcagcctg 1080 
gggctgttcc tgcagtgcgc catctccctg gtcttctctc tggtcatgga ccggctggtg 1140 
cagcgattcg gcactcgagc agtctatttg gccagtgtgg cagctttccc tgtggctgcc 1200 
ggtgccacat gcctgtccca cagtgtggcc gtggtgacag cttcagccgc cctcaccggg 1260 
ttcaccttct cagccctgca gatcctgccc tacacactgg cctccctcta ccaccgggag 1320 
aagcaggtgt tcctgcccaa ataccgaggg gacactggag gtgctagcag tgaggacagc 1380 
ctgatgacca gcttcctgcc aggccctaag cctggagctc ccttccctaa tggacacgtg 1440 
ggtgctggag gcagtggcct gctcccacct ccacccgcgc tctgcggggc ctctgcctgt 1500 
gatgtctccg tacgtgtggt ggtgggtgag cccaccgagg ccagggtggt tccgggccgg 1560 
ggcatctgcc tggacctcgc catcctggat agtgccttcc tgctgtccca ggtggcccca 1620 
tccctgttta tgggctccat tgtccagctc agccagtctg tcactgccta tatggtgtct 1680 
gccgcaggcc tgggtctggt cgccatttac tttgctacac aggtagtatt tgacaagagc 1740 
gacttggcca aatactcagc gggtggacac catcaccatc accattaa 1788 



<210> 37 
<211> 1955 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA encoding codon-optimised Human P501S (amino acids 51-553) fused to St.pneum. C-LytA P2 helper 
epitope C-Lyta 
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EP1 511 768 B1 



<400> 37 



gcggccgcgc caccatggcc gccgcctacg tgcatagcga cgggagctac cccaaggaca 60
agttcgagaa gatcaacggg acatggtact acttcgactc ctccggctac atgctcgccg 120
accgctggcg gaagcacacc gacggcaact ggtactggtt cgataactcg ggagagatgg 180
ccaccggctg gaagaagatc gcggacaagt ggtactattt caacgaggag ggcgccatga 240
agaccggctg ggtgaagtat aaggacacct ggtactacct cgacgccaag gagggcgcca 300
tgcagtatat caaggccaac agcaagttca tcggcatcac cgagggagtg atggtcagca 360
acgcctttat ccagagcgcc gacggcaccg gatggtacta cttgaagccg gacggcaccc 420
tcgcggatcg gcccgagaag ttcatgtaca tggtgctggg catcggcccc gtcctgggcc 480
tcgtgtgtgt gcccctcctc gggagtgcgt ccgatcattg gcggggccgc tacggccgcc 540
gcagaccgtt catctgggcc ctgagcctgg gcatcctgct ctctctcttc ctgatccccc 600
gggccggctg gctggccggc ctgctgtgtc ccgacccccg ccctctggag ctggccctcc 660
tgatcctggg cgtgggcctg ctggacttct gcggccaggt gtgtttcact cccctggagg 720
ctctgctctc cgacctcttc cgcgaccccg accactgtag gcaggcttac agcgtgtacg 780
ccttcatgat cagtctgggg ggatgcctgg gctatctgct gcccgctatc gactgggaca 840
ccagcgccct ggccccctac ctggggactc aggaggagtg cctgttcggc ctgctcacct 900
tgatcttcct gacgtgcgtc gccgccaccc tgctggtggc cgaggaggcg gccctggggc 960
ccaccgagcc cgccgagggc ctgagcgctc ccagcctgag cccccattgc tgcccgtgca 1020
gggctaggct cgccttcagg aatctgggcg ctttgctgcc ccgcctgcat cagctgtgct 1080
gtcgcatgcc tcgcaccctg cgccgcctgt tcgtcgctga gctctgttcc tggatggccc 1140
tgatgacgtt caccctcttc tacaccgact tcgtggggga gggcctgtac cagggcgtgc 1200
ccagggccga gcccggcacc gaggctaggc gccattacga cgagggcgtc aggatgggct 1260
ctctgggcct cttcctgcag tgcgccatca gtctggtgtt ctctctggtg atggaccggc 1320
tggtgcagcg cttcggcacc cgggccgtgt acctcgcctc tgtggcggct ttccccgtcg 1380
ccgccggcgc gacctgcctg tctcattctg tcgccgtggt gaccgccagc gccgccctga 1440
ccggcttcac cttcagtgcg ctccagattc tgccctacac cctggcgtct ctgtaccatc 1500
gcgagaagca ggtgttcctg cccaagtacc gcggggacac agggggagct tcctctgagg 1560
acagcctgat gaccagcttc ttgcccggcc ccaagccggg ggcccctttc cccaacggcc 1620
atgtcggggc gggcggcagc ggcctgctcc ctcccccccc cgccctgtgc ggcgctagtg 1680
cctgcgacgt gagcgtgcgg gtggtggtgg gggagcccac cgaggctagg gtcgtgcctg 1740
gccgggggat ctgcctggac ctggccatcc tcgactccgc cttcctgctc tcccaggtgg 1800
cgcccagcct gttcatgggc agtatcgtgc agctgagcca gagcgtgacc gcctacatgg 1860
tgagcgccgc cggcctgggg ttggtggcca tctactttgc cacccaggtc gtgttcgaca 1920
agagcgatct cgccaagtat agcgcctgag gatcc 1955



<210> 38 
<211> 2045
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA encoding codon-optimised Human P501S (amino acids 1-553) fused to St.pneum. C-LytA P2 helper 
epitope C-Lyta 

<400> 38 
gcggccgcgc caccatggcc gccgcctacg tgcatagcga cgggagctac cccaaggaca 60
agttcgagaa gatcaacggg acatggtact acttcgactc ctccggctac atgctcgccg 120
accgctggcg gaagcacacc gacggcaact ggtactggtt cgataactcg ggagagatgg 180
ccaccggctg gaagaagatc gcggacaagt ggtactattt caacgaggag ggcgccatga 240
agaccggctg ggtgaagtat aaggacacct ggtactacct cgacgccaag gagggcgcca 300
tgcagtatat caaggccaac agcaagttca tcggcatcac cgagggagtg atggtcagca 360
acgcctttat ccagagcgcc gacggcaccg gatggtacta cttgaagccg gacggcaccc 420
tcgcggatcg gcccgagatg gtgcagcggc tgtgggtgtc ccggctgctg cgccatagaa 480
aggcccagtt gctgctggtg aacctgctga ctttcggact ggaggtgtgc ctggctgccg 540
tggtgctggg catcggcccc gtcctgggcc tcgtgtgtgt gcccctcctc gggagtgcgt 600
ccgatcattg gcggggccgc tacggccgcc gcagaccgtt catctgggcc ctgagcctgg 660
gcatcctgct ctctctcttc ctgatccccc gggccggctg gctggccggc ctgctgtgtc 720
ccgacccccg ccctctggag ctggccctcc tgatcctggg cgtgggcctg ctggacttct 780
gcggccaggt gtgtttcact cccctggagg ctctgctctc cgacctcttc cgcgaccccg 840
accactgtag gcaggcttac agcgtgtacg ccttcatgat cagtctgggg ggatgcctgg 900
gctatctgct gcccgctatc gactgggaca ccagcgccct ggccccctac ctggggactc 960
aggaggagtg cctgttcggc ctgctcacct tgatcttcct gacgtgcgtc gccgccaccc 1020
tgctggtggc cgaggaggcg gccctggggc ccaccgagcc cgccgagggc ctgagcgctc 1080
ccagcctgag cccccattgc tgcccgtgca gggctaggct cgccttcagg aatctgggcg 1140
ctttgctgcc ccgcctgcat cagctgtgct gtcgcatgcc tcgcaccctg cgccgcctgt 1200
tcgtcgctga gctctgttcc tggatggccc tgatgacgtt caccctcttc tacaccgact 1260
tcgtggggga gggcctgtac cagggcgtgc ccagggccga gcccggcacc gaggctaggc 1320
gccattacga cgagggcgtc aggatgggct ctctgggcct cttcctgcag tgcgccatca 1380
gtctggtgtt ctctctggtg atggaccggc tggtgcagcg cttcggcacc cgggccgtgt 1440
acctcgcctc tgtggcggct ttccccgtcg ccgccggcgc gacctgcctg tctcattctg 1500
tcgccgtggt gaccgccagc gccgccctga ccggcttcac cttcagtgcg ctccagattc 1560
tgccctacac cctggcgtct ctgtaccatc gcgagaagca ggtgttcctg cccaagtacc 1620
gcggggacac agggggagct tcctctgagg acagcctgat gaccagcttc ttgcccggcc 1680
ccaagccggg ggcccctttc cccaacggcc atgtcggggc gggcggcagc ggcctgctcc 1740
ctcccccccc cgccctgtgc ggcgctagtg cctgcgacgt gagcgtgcgg gtggtggtgg 1800
gggagcccac cgaggctagg gtcgtgcctg gccgggggat ctgcctggac ctggccatcc 1860
tcgactccgc cttcctgctc tcccaggtgg cgcccagcct gttcatgggc agtatcgtgc 1920
agctgagcca gagcgtgacc gcctacatgg tgagcgccgc cggcctgggg ttggtggcca 1980
tctactttgc cacccaggtc gtgttcgaca agagcgatct cgccaagtat agcgcctgag 2040
gatcc 2045



<210> 39 
<211> 2105 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA encoding St.pneum. C-LytA P2 helper epitope C-Lyta fused to Human P501S (amino acids 51-553) 
fused to Human P501S (amino acids 1-50) - Codon-optimised 



<400> 39 
gcggccgcgc caccatggcc gccgcctacg tgcatagcga cgggagctac cccaaggaca 60
agttcgagaa gatcaacggg acatggtact acttcgactc ctccggctac atgctcgccg 120
accgctggcg gaagcacacc gacggcaact ggtactggtt cgataactcg ggagagatgg 180
ccaccggctg gaagaagatc gcggacaagt ggtactattt caacgaggag ggcgccatga 240
agaccggctg ggtgaagtat aaggacacct ggtactacct cgacgccaag gagggcgcca 300
tgcagtatat caaggccaac agcaagttca tcggcatcac cgagggagtg atggtcagca 360
acgcctttat ccagagcgcc gacggcaccg gatggtacta cttgaagccg gacggcaccc 420
tcgcggatcg gcccgagaag ttcatgtaca tggtgctggg catcggcccc gtcctgggcc 480
tcgtgtgtgt gcccctcctc gggagtgcgt ccgatcattg gcggggccgc tacggccgcc 540
gcagaccgtt catctgggcc ctgagcctgg gcatcctgct ctctctcttc ctgatccccc 600
gggccggctg gctggccggc ctgctgtgtc ccgacccccg ccctctggag ctggccctcc 660
tgatcctggg cgtgggcctg ctggacttct gcggccaggt gtgtttcact cccctggagg 720
ctctgctctc cgacctcttc cgcgaccccg accactgtag gcaggcttac agcgtgtacg 780
ccttcatgat cagtctgggg ggatgcctgg gctatctgct gcccgctatc gactgggaca 840
ccagcgccct ggccccctac ctggggactc aggaggagtg cctgttcggc ctgctcacct 900
tgatcttcct gacgtgcgtc gccgccaccc tgctggtggc cgaggaggcg gccctggggc 960
ccaccgagcc cgccgagggc ctgagcgctc ccagcctgag cccccattgc tgcccgtgca 1020
gggctaggct cgccttcagg aatctgggcg ctttgctgcc ccgcctgcat cagctgtgct 1080
gtcgcatgcc tcgcaccctg cgccgcctgt tcgtcgctga gctctgttcc tggatggccc 1140
tgatgacgtt caccctcttc tacaccgact tcgtggggga gggcctgtac cagggcgtgc 1200
ccagggccga gcccggcacc gaggctaggc gccattacga cgagggcgtc aggatgggct 1260
ctctgggcct cttcctgcag tgcgccatca gtctggtgtt ctctctggtg atggaccggc 1320
tggtgcagcg cttcggcacc cgggccgtgt acctcgcctc tgtggcggct ttccccgtcg 1380
ccgccggcgc gacctgcctg tctcattctg tcgccgtggt gaccgccagc gccgccctga 1440
ccggcttcac cttcagtgcg ctccagattc tgccctacac cctggcgtct ctgtaccatc 1500
gcgagaagca ggtgttcctg cccaagtacc gcggggacac agggggagct tcctctgagg 1560
acagcctgat gaccagcttc ttgcccggcc ccaagccggg ggcccctttc cccaacggcc 1620
atgtcggggc gggcggcagc ggcctgctcc ctcccccccc cgccctgtgc ggcgctagtg 1680
cctgcgacgt gagcgtgcgg gtggtggtgg gggagcccac cgaggctagg gtcgtgcctg 1740
gccgggggat ctgcctggac ctggccatcc tcgactccgc cttcctgctc tcccaggtgg 1800
cgcccagcct gttcatgggc agtatcgtgc agctgagcca gagcgtgacc gcctacatgg 1860
tgagcgccgc cggcctgggg ttggtggcca tctactttgc cacccaggtc gtgttcgaca 1920
agagcgatct cgccaagtat agcgccatgg tgcagcggct gtgggtgtcc cggctgctgc 1980
gccatagaaa ggcccagttg ctgctggtga acctgctgac tttcggactg gaggtgtgcc 2040
tggctgccgg gatcacgtac gtgccccccc tgctgctgga ggtgggcgtg gaggagtgag 2100
gatcc 2105



<210> 40
<211> 2105 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> DNA encoding Human P501S (amino acids 1-50) fused to St.pneum. C-LytA P2 helper epitope C-Lyta fused 
to Human P501S (amino acids 51-553) - Codon-optimised 



<400> 40 
gcggccgcgc caccatggtg cagcggctgt gggtgtcccg gctgctgcgc catagaaagg 60
cccagttgct gctggtgaac ctgctgactt tcggactgga ggtgtgcctg gctgccggga 120
tcacgtacgt gccccccctg ctgctggagg tgggcgtgga ggagatggcc gccgcctacg 180
tgcatagcga cgggagctac cccaaggaca agttcgagaa gatcaacggg acatggtact 240
acttcgactc ctccggctac atgctcgccg accgctggcg gaagcacacc gacggcaact 300
ggtactggtt cgataactcg ggagagatgg ccaccggctg gaagaagatc gcggacaagt 360
ggtactattt caacgaggag ggcgccatga agaccggctg ggtgaagtat aaggacacct 420
ggtactacct cgacgccaag gagggcgcca tgcagtatat caaggccaac agcaagttca 480
tcggcatcac cgagggagtg atggtcagca acgcctttat ccagagcgcc gacggcaccg 540
gatggtacta cttgaagccg gacggcaccc tcgcggatcg gcccgagaag ttcatgtaca 600
tggtgctggg catcggcccc gtcctgggcc tcgtgtgtgt gcccctcctc gggagtgcgt 660
ccgatcattg gcggggccgc tacggccgcc gcagaccgtt catctgggcc ctgagcctgg 720
gcatcctgct ctctctcttc ctgatccccc gggccggctg gctggccggc ctgctgtgtc 780
ccgacccccg ccctctggag ctggccctcc tgatcctggg cgtgggcctg ctggacttct 840
gcggccaggt gtgtttcact cccctggagg ctctgctctc cgacctcttc cgcgaccccg 900
accactgtag gcaggcttac agcgtgtacg ccttcatgat cagtctgggg ggatgcctgg 960
gctatctgct gcccgctatc gactgggaca ccagcgccct ggccccctac ctggggactc 1020
aggaggagtg cctgttcggc ctgctcacct tgatcttcct gacgtgcgtc gccgccaccc 1080
tgctggtggc cgaggaggcg gccctggggc ccaccgagcc cgccgagggc ctgagcgctc 1140
ccagcctgag cccccattgc tgcccgtgca gggctaggct cgccttcagg aatctgggcg 1200
ctttgctgcc ccgcctgcat cagctgtgct gtcgcatgcc tcgcaccctg cgccgcctgt 1260
tcgtcgctga gctctgttcc tggatggccc tgatgacgtt caccctcttc tacaccgact 1320
tcgtggggga gggcctgtac cagggcgtgc ccagggccga gcccggcacc gaggctaggc 1380
gccattacga cgagggcgtc aggatgggct ctctgggcct cttcctgcag tgcgccatca 1440
gtctggtgtt ctctctggtg atggaccggc tggtgcagcg cttcggcacc cgggccgtgt 1500
acctcgcctc tgtggcggct ttccccgtcg ccgccggcgc gacctgcctg tctcattctg 1560
tcgccgtggt gaccgccagc gccgccctga ccggcttcac cttcagtgcg ctccagattc 1620
tgccctacac cctggcgtct ctgtaccatc gcgagaagca ggtgttcctg cccaagtacc 1680
gcggggacac agggggagct tcctctgagg acagcctgat gaccagcttc ttgcccggcc 1740
ccaagccggg ggcccctttc cccaacggcc atgtcggggc gggcggcagc ggcctgctcc 1800
ctcccccccc cgccctgtgc ggcgctagtg cctgcgacgt gagcgtgcgg gtggtggtgg 1860
gggagcccac cgaggctagg gtcgtgcctg gccgggggat ctgcctggac ctggccatcc 1920
tcgactccgc cttcctgctc tcccaggtgg cgcccagcct gttcatgggc agtatcgtgc 1980
agctgagcca gagcgtgacc gcctacatgg tgagcgccgc cggcctgggg ttggtggcca 2040
tctactttgc cacccaggtc gtgttcgaca agagcgatct cgccaagtat agcgcctgag 2100
gatcc 2105



<210> 41
<211> 652
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> St.pneum. C-LytA P2 helper epitope C-Lyta fused to Human P501S 
<400> 41 
Met Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys
1               5                   10                  15

Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr
            20                  25                  30

Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
        35                  40                  45

Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
    50                  55                  60

Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
65                  70                  75                  80

Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met
                85                  90                  95

Gln Tyr Ile Lys Ala Asn Ser Lys Phe Ile Gly Ile Thr Glu Gly Val
            100                 105                 110

Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp Tyr
        115                 120                 125

Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Lys Phe Met
    130                 135                 140

Tyr Met Val Leu Gly Ile Gly Pro Val Leu Gly Leu Val Cys Val Pro
145                 150                 155                 160

Leu Leu Gly Ser Ala Ser Asp His Trp Arg Gly Arg Tyr Gly Arg Arg
                165                 170                 175

Arg Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile Leu Leu Ser Leu Phe
            180                 185                 190

Leu Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu Leu Cys Pro Asp Pro
        195                 200                 205

Arg Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly Val Gly Leu Leu Asp
    210                 215                 220

Phe Cys Gly Gln Val Cys Phe Thr Pro Leu Glu Ala Leu Leu Ser Asp
225                 230                 235                 240

Leu Phe Arg Asp Pro Asp His Cys Arg Gln Ala Tyr Ser Val Tyr Ala
                245                 250                 255

Phe Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr Leu Leu Pro Ala Ile
            260                 265                 270

Asp Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu Gly Thr Gln Glu Glu
        275                 280                 285

Cys Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu Thr Cys Val Ala Ala
    290                 295                 300

Thr Leu Leu Val Ala Glu Glu Ala Ala Leu Gly Pro Thr Glu Pro Ala
305                 310                 315                 320

Glu Gly Leu Ser Ala Pro Ser Leu Ser Pro His Cys Cys Pro Cys Arg
                325                 330                 335

Ala Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu Leu Pro Arg Leu His
            340                 345                 350

Gln Leu Cys Cys Arg Met Pro Arg Thr Leu Arg Arg Leu Phe Val Ala
        355                 360                 365

Glu Leu Cys Ser Trp Met Ala Leu Met Thr Phe Thr Leu Phe Tyr Thr
    370                 375                 380

Asp Phe Val Gly Glu Gly Leu Tyr Gln Gly Val Pro Arg Ala Glu Pro
385                 390                 395                 400

Gly Thr Glu Ala Arg Arg His Tyr Asp Glu Gly Val Arg Met Gly Ser
                405                 410                 415

Leu Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu Val Phe Ser Leu Val
            420                 425                 430

Met Asp Arg Leu Val Gln Arg Phe Gly Thr Arg Ala Val Tyr Leu Ala
        435                 440                 445

Ser Val Ala Ala Phe Pro Val Ala Ala Gly Ala Thr Cys Leu Ser His
    450                 455                 460

Ser Val Ala Val Val Thr Ala Ser Ala Ala Leu Thr Gly Phe Thr Phe
465                 470                 475                 480

Ser Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala Ser Leu Tyr His Arg
                485                 490                 495

Glu Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly Asp Thr Gly Gly Ala
            500                 505                 510

Ser Ser Glu Asp Ser Leu Met Thr Ser Phe Leu Pro Gly Pro Lys Pro
        515                 520                 525

Gly Ala Pro Phe Pro Asn Gly His Val Gly Ala Gly Gly Ser Gly Leu
    530                 535                 540

Leu Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser Ala Cys Asp Val Ser
545                 550                 555                 560

Val Arg Val Val Val Gly Glu Pro Thr Glu Ala Arg Val Val Pro Gly
                565                 570                 575

Arg Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp Ser Ala Phe Leu Leu
            580                 585                 590

Ser Gln Val Ala Pro Ser Leu Phe Met Gly Ser Ile Val Gln Leu Ser
        595                 600                 605

Gln Ser Val Thr Ala Tyr Met Val Ser Ala Ala Gly Leu Gly Leu Val
    610                 615                 620

Ala Ile Tyr Phe Ala Thr Gln Val Val Phe Asp Lys Ser Asp Leu Ala
625                 630                 635                 640

Lys Tyr Ser Ala Gly Gly His His His His His His
                645                 650











<210> 42
<211> 1959
<212> DNA
<213> Artificial Sequence

<220>

<223> DNA encoding St.pneum. C-LytA P2 helper epitope C-Lyta fused to Human P501S (plus his tag)
<400> 42
atggcggccg cttacgtaca ttccgacggc tcttatccaa aagacaagtt tgagaaaatc 60
aatggcactt ggtactactt tgacagttca ggctatatgc ttgcagaccg ctggaggaag 120
cacacagacg gcaactggta ctggttcgac aactcaggcg aaatggctac aggctggaag 180
aaaatcgctg ataagtggta ctatttcaac gaagaaggtg ccatgaagac aggctgggtc 240
aagtacaagg acacttggta ctacttagac gctaaagaag gcgccatgca atacatcaag 300
gctaactcta agttcattgg tatcactgaa ggcgtcatgg tatcaaatgc ctttatccag 360
tcagcggacg gaacaggctg gtactacctc aaaccagacg gaacactggc agacaggcca 420
gaaaagttca tgtacatggt gctgggcatt ggtccagtgc tgggcctggt ctgtgtcccg 480
ctcctaggct cagccagtga ccactggcgt ggacgctatg gccgccgccg gcccttcatc 540
tgggcactgt ccttgggcat cctgctgagc ctctttctca tcccaagggc cggctggcta 600
gcagggctgc tgtgcccgga tcccaggccc ctggagctgg cactgctcat cctgggcgtg 660
gggctgctgg acttctgtgg ccaggtgtgc ttcactccac tggaggccct gctctctgac 720
ctcttccggg acccggacca ctgtcgccag gcctactctg tctatgcctt catgatcagt 780
cttgggggct gcctgggcta cctcctgcct gccattgact gggacaccag tgccctggcc 840
ccctacctgg gcacccagga ggagtgcctc tttggcctgc tcaccctcat cttcctcacc 900
tgcgtagcag ccacactgct ggtggctgag gaggcagcgc tgggccccac cgagccagca 960
gaagggctgt cggccccctc cttgtcgccc cactgctgtc catgccgggc ccgcttggct 1020
ttccggaacc tgggcgccct gcttccccgg ctgcaccagc tgtgctgccg catgccccgc 1080
accctgcgcc ggctcttcgt ggctgagctg tgcagctgga tggcactcat gaccttcacg 1140
ctgttttaca cggatttcgt gggcgagggg ctgtaccagg gcgtgcccag agctgagccg 1200
ggcaccgagg cccggagaca ctatgatgaa ggcgttcgga tgggcagcct ggggctgttc 1260
ctgcagtgcg ccatctccct ggtcttctct ctggtcatgg accggctggt gcagcgattc 1320
ggcactcgag cagtctattt ggccagtgtg gcagctttcc ctgtggctgc cggtgccaca 1380
tgcctgtccc acagtgtggc cgtggtgaca gcttcagccg ccctcaccgg gttcaccttc 1440
tcagccctgc agatcctgcc ctacacactg gcctccctct accaccggga gaagcaggtg 1500
ttcctgccca aataccgagg ggacactgga ggtgctagca gtgaggacag cctgatgacc 1560
agcttcctgc caggccctaa gcctggagct cccttcccta atggacacgt gggtgctgga 1620
ggcagtggcc tgctcccacc tccacccgcg ctctgcgggg cctctgcctg tgatgtctcc 1680
gtacgtgtgg tggtgggtga gcccaccgag gccagggtgg ttccgggccg gggcatctgc 1740
ctggacctcg ccatcctgga tagtgccttc ctgctgtccc aggtggcccc atccctgttt 1800
atgggctcca ttgtccagct cagccagtct gtcactgcct atatggtgtc tgccgcaggc 1860
ctgggtctgg tcgccattta ctttgctaca caggtagtat ttgacaagag cgacttggcc 1920
aaatactcag cgggtggaca ccatcaccat caccattaa 1959



<210> 43 
<211> 553 
<212> PRT 
<213> Homo sapiens 

<400> 43 
Met Val Gln Arg Leu Trp Val Ser Arg Leu Leu Arg His Arg Lys Ala
1               5                   10                  15

Gln Leu Leu Leu Val Asn Leu Leu Thr Phe Gly Leu Glu Val Cys Leu
            20                  25                  30

Ala Ala Gly Ile Thr Tyr Val Pro Pro Leu Leu Leu Glu Val Gly Val
        35                  40                  45

Glu Glu Lys Phe Met Thr Met Val Leu Gly Ile Gly Pro Val Leu Gly
    50                  55                  60

Leu Val Cys Val Pro Leu Leu Gly Ser Ala Ser Asp His Trp Arg Gly
65                  70                  75                  80

Arg Tyr Gly Arg Arg Arg Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile
                85                  90                  95

Leu Leu Ser Leu Phe Leu Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu
            100                 105                 110

Leu Cys Pro Asp Pro Arg Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly
        115                 120                 125

Val Gly Leu Leu Asp Phe Cys Gly Gln Val Cys Phe Thr Pro Leu Glu
    130                 135                 140

Ala Leu Leu Ser Asp Leu Phe Arg Asp Pro Asp His Cys Arg Gln Ala
145                 150                 155                 160

Tyr Ser Val Tyr Ala Phe Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr
                165                 170                 175

Leu Leu Pro Ala Ile Asp Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu
            180                 185                 190

Gly Thr Gln Glu Glu Cys Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu
        195                 200                 205

Thr Cys Val Ala Ala Thr Leu Leu Val Ala Glu Glu Ala Ala Leu Gly
    210                 215                 220

Pro Thr Glu Pro Ala Glu Gly Leu Ser Ala Pro Ser Leu Ser Pro His
225                 230                 235                 240

Cys Cys Pro Cys Arg Ala Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu
                245                 250                 255

Leu Pro Arg Leu His Gln Leu Cys Cys Arg Met Pro Arg Thr Leu Arg
            260                 265                 270

Arg Leu Phe Val Ala Glu Leu Cys Ser Trp Met Ala Leu Met Thr Phe
        275                 280                 285

Thr Leu Phe Tyr Thr Asp Phe Val Gly Glu Gly Leu Tyr Gln Gly Val
    290                 295                 300

Pro Arg Ala Glu Pro Gly Thr Glu Ala Arg Arg His Tyr Asp Glu Gly
305                 310                 315                 320

Val Arg Met Gly Ser Leu Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu
                325                 330                 335

Val Phe Ser Leu Val Met Asp Arg Leu Val Gln Arg Phe Gly Thr Arg
            340                 345                 350

Ala Val Tyr Leu Ala Ser Val Ala Ala Phe Pro Val Ala Ala Gly Ala
        355                 360                 365

Thr Cys Leu Ser His Ser Val Ala Val Val Thr Ala Ser Ala Ala Leu
    370                 375                 380

Thr Gly Phe Thr Phe Ser Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala
385                 390                 395                 400

Ser Leu Tyr His Arg Glu Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly
                405                 410                 415

Asp Thr Gly Gly Ala Ser Ser Glu Asp Ser Leu Met Thr Ser Phe Leu
            420                 425                 430

Pro Gly Pro Lys Pro Gly Ala Pro Phe Pro Asn Gly His Val Gly Ala
        435                 440                 445

Gly Gly Ser Gly Leu Leu Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser
    450                 455                 460

Ala Cys Asp Val Ser Val Arg Val Val Val Gly Glu Pro Thr Glu Ala
465                 470                 475                 480

Arg Val Val Pro Gly Arg Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp
                485                 490                 495

Ser Ala Phe Leu Leu Ser Gln Val Ala Pro Ser Leu Phe Met Gly Ser
            500                 505                 510

Ile Val Gln Leu Ser Gln Ser Val Thr Ala Tyr Met Val Ser Ala Ala
        515                 520                 525

Gly Leu Gly Leu Val Ala Ile Tyr Phe Ala Thr Gln Val Val Phe Asp
    530                 535                 540

Lys Ser Asp Leu Ala Lys Tyr Ser Ala
545                 550



<210> 44
<211> 644
<212> PRT

<213> Artificial Sequence

<220>

<223> St.pneum. C-LytA P2 helper epitope C-Lyta fused to Human P501S

<400> 44

Met Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys
1               5                   10                  15

Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr
            20                  25                  30

Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
        35                  40                  45

Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
    50                  55                  60

Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
65                  70                  75                  80

Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met
                85                  90                  95

Gln Tyr Ile Lys Ala Asn Ser Lys Phe Ile Gly Ile Thr Glu Gly Val
            100                 105                 110

Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp Tyr
        115                 120                 125

Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Lys Phe Met
    130                 135                 140

Tyr Met Val Leu Gly Ile Gly Pro Val Leu Gly Leu Val Cys Val Pro
145                 150                 155                 160

Leu Leu Gly Ser Ala Ser Asp His Trp Arg Gly Arg Tyr Gly Arg Arg
                165                 170                 175

Arg Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile Leu Leu Ser Leu Phe
            180                 185                 190

Leu Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu Leu Cys Pro Asp Pro
        195                 200                 205

Arg Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly Val Gly Leu Leu Asp
    210                 215                 220

Phe Cys Gly Gln Val Cys Phe Thr Pro Leu Glu Ala Leu Leu Ser Asp
225                 230                 235                 240

Leu Phe Arg Asp Pro Asp His Cys Arg Gln Ala Tyr Ser Val Tyr Ala
                245                 250                 255

Phe Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr Leu Leu Pro Ala Ile
            260                 265                 270

Asp Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu Gly Thr Gln Glu Glu
        275                 280                 285

Cys Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu Thr Cys Val Ala Ala
    290                 295                 300

Thr Leu Leu Val Ala Glu Glu Ala Ala Leu Gly Pro Thr Glu Pro Ala
305                 310                 315                 320

Glu Gly Leu Ser Ala Pro Ser Leu Ser Pro His Cys Cys Pro Cys Arg
                325                 330                 335

Ala Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu Leu Pro Arg Leu His
            340                 345                 350

Gln Leu Cys Cys Arg Met Pro Arg Thr Leu Arg Arg Leu Phe Val Ala
        355                 360                 365

Glu Leu Cys Ser Trp Met Ala Leu Met Thr Phe Thr Leu Phe Tyr Thr
    370                 375                 380

Asp Phe Val Gly Glu Gly Leu Tyr Gln Gly Val Pro Arg Ala Glu Pro
385                 390                 395                 400

Gly Thr Glu Ala Arg Arg His Tyr Asp Glu Gly Val Arg Met Gly Ser
                405                 410                 415

Leu Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu Val Phe Ser Leu Val
            420                 425                 430

Met Asp Arg Leu Val Gln Arg Phe Gly Thr Arg Ala Val Tyr Leu Ala
        435                 440                 445

Ser Val Ala Ala Phe Pro Val Ala Ala Gly Ala Thr Cys Leu Ser His
    450                 455                 460

Ser Val Ala Val Val Thr Ala Ser Ala Ala Leu Thr Gly Phe Thr Phe
465                 470                 475                 480

Ser Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala Ser Leu Tyr His Arg
                485                 490                 495

Glu Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly Asp Thr Gly Gly Ala
            500                 505                 510

Ser Ser Glu Asp Ser Leu Met Thr Ser Phe Leu Pro Gly Pro Lys Pro
        515                 520                 525

Gly Ala Pro Phe Pro Asn Gly His Val Gly Ala Gly Gly Ser Gly Leu
    530                 535                 540

Leu Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser Ala Cys Asp Val Ser
545                 550                 555                 560

Val Arg Val Val Val Gly Glu Pro Thr Glu Ala Arg Val Val Pro Gly
                565                 570                 575

Arg Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp Ser Ala Phe Leu Leu
            580                 585                 590

Ser Gln Val Ala Pro Ser Leu Phe Met Gly Ser Ile Val Gln Leu Ser
        595                 600                 605

Gln Ser Val Thr Ala Tyr Met Val Ser Ala Ala Gly Leu Gly Leu Val
    610                 615                 620

Ala Ile Tyr Phe Ala Thr Gln Val Val Phe Asp Lys Ser Asp Leu Ala
625                 630                 635                 640

Lys Tyr Ser Ala
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<220> 

<223> Codon-optimised hybrid protein between St.pneum. C-LytA P2 helper epitope C-Lyta fused to Human P501S
(amino acids 51-553)

<400> 45

Met Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys
1               5                   10                  15

Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr
            20                  25                  30

Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
        35                  40                  45

Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
    50                  55                  60

Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
65                  70                  75                  80

Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met
                85                  90                  95

Gln Tyr Ile Lys Ala Asn Ser Lys Phe Ile Gly Ile Thr Glu Gly Val
            100                 105                 110

Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp Tyr
        115                 120                 125

Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Lys Phe Met
    130                 135                 140

Tyr Met Val Leu Gly Ile Gly Pro Val Leu Gly Leu Val Cys Val Pro
145                 150                 155                 160

Leu Leu Gly Ser Ala Ser Asp His Trp Arg Gly Arg Tyr Gly Arg Arg
                165                 170                 175

Arg Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile Leu Leu Ser Leu Phe
            180                 185                 190

Leu Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu Leu Cys Pro Asp Pro
        195                 200                 205

Arg Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly Val Gly Leu Leu Asp
    210                 215                 220

Phe Cys Gly Gln Val Cys Phe Thr Pro Leu Glu Ala Leu Leu Ser Asp
225                 230                 235                 240

Leu Phe Arg Asp Pro Asp His Cys Arg Gln Ala Tyr Ser Val Tyr Ala
                245                 250                 255

Phe Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr Leu Leu Pro Ala Ile
            260                 265                 270

Asp Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu Gly Thr Gln Glu Glu
        275                 280                 285

Cys Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu Thr Cys Val Ala Ala
    290                 295                 300

Thr Leu Leu Val Ala Glu Glu Ala Ala Leu Gly Pro Thr Glu Pro Ala
305                 310                 315                 320

Glu Gly Leu Ser Ala Pro Ser Leu Ser Pro His Cys Cys Pro Cys Arg
                325                 330                 335

Ala Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu Leu Pro Arg Leu His
            340                 345                 350

Gln Leu Cys Cys Arg Met Pro Arg Thr Leu Arg Arg Leu Phe Val Ala
        355                 360                 365

Glu Leu Cys Ser Trp Met Ala Leu Met Thr Phe Thr Leu Phe Tyr Thr
    370                 375                 380

Asp Phe Val Gly Glu Gly Leu Tyr Gln Gly Val Pro Arg Ala Glu Pro
385                 390                 395                 400

Gly Thr Glu Ala Arg Arg His Tyr Asp Glu Gly Val Arg Met Gly Ser
                405                 410                 415

Leu Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu Val Phe Ser Leu Val
            420                 425                 430

Met Asp Arg Leu Val Gln Arg Phe Gly Thr Arg Ala Val Tyr Leu Ala
        435                 440                 445

Ser Val Ala Ala Phe Pro Val Ala Ala Gly Ala Thr Cys Leu Ser His
    450                 455                 460

Ser Val Ala Val Val Thr Ala Ser Ala Ala Leu Thr Gly Phe Thr Phe
465                 470                 475                 480

Ser Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala Ser Leu Tyr His Arg
                485                 490                 495

Glu Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly Asp Thr Gly Gly Ala
            500                 505                 510

Ser Ser Glu Asp Ser Leu Met Thr Ser Phe Leu Pro Gly Pro Lys Pro
        515                 520                 525

Gly Ala Pro Phe Pro Asn Gly His Val Gly Ala Gly Gly Ser Gly Leu
    530                 535                 540

Leu Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser Ala Cys Asp Val Ser
545                 550                 555                 560

Val Arg Val Val Val Gly Glu Pro Thr Glu Ala Arg Val Val Pro Gly
                565                 570                 575

Arg Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp Ser Ala Phe Leu Leu
            580                 585                 590

Ser Gln Val Ala Pro Ser Leu Phe Met Gly Ser Ile Val Gln Leu Ser
        595                 600                 605

Gln Ser Val Thr Ala Tyr Met Val Ser Ala Ala Gly Leu Gly Leu Val
    610                 615                 620

Ala Ile Tyr Phe Ala Thr Gln Val Val Phe Asp Lys Ser Asp Leu Ala
625                 630                 635                 640

Lys Tyr Ser Ala



























<210> 46
<211> 694
<212> PRT

<213> Artificial Sequence

<220>

<223> St.pneum. C-LytA P2 helper epitope C-Lyta fused to Human P501S (amino acids 1-553) - codon optimised

<400> 46

Met Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys
1               5                   10                  15

Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr
            20                  25                  30

Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
        35                  40                  45

Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
    50                  55                  60

Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
65                  70                  75                  80

Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met
                85                  90                  95

Gln Tyr Ile Lys Ala Asn Ser Lys Phe Ile Gly Ile Thr Glu Gly Val
            100                 105                 110

Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp Tyr
        115                 120                 125

Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Met Val Gln
    130                 135                 140

Arg Leu Trp Val Ser Arg Leu Leu Arg His Arg Lys Ala Gln Leu Leu
145                 150                 155                 160

Leu Val Asn Leu Leu Thr Phe Gly Leu Glu Val Cys Leu Ala Ala Gly
                165                 170                 175

Ile Thr Tyr Val Pro Pro Leu Leu Leu Glu Val Gly Val Glu Glu Lys
            180                 185                 190

Phe Met Thr Met Val Leu Gly Ile Gly Pro Val Leu Gly Leu Val Cys
        195                 200                 205

Val Pro Leu Leu Gly Ser Ala Ser Asp His Trp Arg Gly Arg Tyr Gly
    210                 215                 220

Arg Arg Arg Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile Leu Leu Ser
225                 230                 235                 240

Leu Phe Leu Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu Leu Cys Pro
                245                 250                 255

Asp Pro Arg Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly Val Gly Leu
            260                 265                 270

Leu Asp Phe Cys Gly Gln Val Cys Phe Thr Pro Leu Glu Ala Leu Leu
        275                 280                 285

Ser Asp Leu Phe Arg Asp Pro Asp His Cys Arg Gln Ala Tyr Ser Val
    290                 295                 300

Tyr Ala Phe Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr Leu Leu Pro
305                 310                 315                 320

Ala Ile Asp Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu Gly Thr Gln
                325                 330                 335

Glu Glu Cys Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu Thr Cys Val
            340                 345                 350

Ala Ala Thr Leu Leu Val Ala Glu Glu Ala Ala Leu Gly Pro Thr Glu
        355                 360                 365

Pro Ala Glu Gly Leu Ser Ala Pro Ser Leu Ser Pro His Cys Cys Pro
    370                 375                 380

Cys Arg Ala Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu Leu Pro Arg
385                 390                 395                 400

Leu His Gln Leu Cys Cys Arg Met Pro Arg Thr Leu Arg Arg Leu Phe
                405                 410                 415

Val Ala Glu Leu Cys Ser Trp Met Ala Leu Met Thr Phe Thr Leu Phe
            420                 425                 430

Tyr Thr Asp Phe Val Gly Glu Gly Leu Tyr Gln Gly Val Pro Arg Ala
        435                 440                 445

Glu Pro Gly Thr Glu Ala Arg Arg His Tyr Asp Glu Gly Val Arg Met
    450                 455                 460

Gly Ser Leu Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu Val Phe Ser
465                 470                 475                 480

Leu Val Met Asp Arg Leu Val Gln Arg Phe Gly Thr Arg Ala Val Tyr
                485                 490                 495

Leu Ala Ser Val Ala Ala Phe Pro Val Ala Ala Gly Ala Thr Cys Leu
            500                 505                 510

Ser His Ser Val Ala Val Val Thr Ala Ser Ala Ala Leu Thr Gly Phe
        515                 520                 525

Thr Phe Ser Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala Ser Leu Tyr
    530                 535                 540

His Arg Glu Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly Asp Thr Gly
545                 550                 555                 560

Gly Ala Ser Ser Glu Asp Ser Leu Met Thr Ser Phe Leu Pro Gly Pro
                565                 570                 575

Lys Pro Gly Ala Pro Phe Pro Asn Gly His Val Gly Ala Gly Gly Ser
            580                 585                 590

Gly Leu Leu Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser Ala Cys Asp
        595                 600                 605

Val Ser Val Arg Val Val Val Gly Glu Pro Thr Glu Ala Arg Val Val
    610                 615                 620

Pro Gly Arg Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp Ser Ala Phe
625                 630                 635                 640

Leu Leu Ser Gln Val Ala Pro Ser Leu Phe Met Gly Ser Ile Val Gln
                645                 650                 655

Leu Ser Gln Ser Val Thr Ala Tyr Met Val Ser Ala Ala Gly Leu Gly
            660                 665                 670

Leu Val Ala Ile Tyr Phe Ala Thr Gln Val Val Phe Asp Lys Ser Asp
        675                 680                 685

Leu Ala Lys Tyr Ser Ala
    690



<212> PRT 

<213> Artificial Sequence 
<220> 

<223> St.pneum. C-LytA P2 helper epitope C-Lyta fused to Human P501S (amino acids 51-553) fused to Human
P501S (amino acids 1-50) - codon-optimised

Met Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys
1               5                   10                  15

Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr
            20                  25                  30

Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
        35                  40                  45

Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
    50                  55                  60

Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
65                  70                  75                  80

Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met
                85                  90                  95

Gln Tyr Ile Lys Ala Asn Ser Lys Phe Ile Gly Ile Thr Glu Gly Val
            100                 105                 110

Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp Tyr
        115                 120                 125

Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Lys Phe Met
    130                 135                 140

Tyr Met Val Leu Gly Ile Gly Pro Val Leu Gly Leu Val Cys Val Pro
145                 150                 155                 160

Leu Leu Gly Ser Ala Ser Asp His Trp Arg Gly Arg Tyr Gly Arg Arg
                165                 170                 175

Arg Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile Leu Leu Ser Leu Phe
            180                 185                 190

Leu Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu Leu Cys Pro Asp Pro
        195                 200                 205

Arg Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly Val Gly Leu Leu Asp
    210                 215                 220

Phe Cys Gly Gln Val Cys Phe Thr Pro Leu Glu Ala Leu Leu Ser Asp
225                 230                 235                 240

Leu Phe Arg Asp Pro Asp His Cys Arg Gln Ala Tyr Ser Val Tyr Ala
                245                 250                 255

Phe Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr Leu Leu Pro Ala Ile
            260                 265                 270

Asp Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu Gly Thr Gln Glu Glu
        275                 280                 285

Cys Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu Thr Cys Val Ala Ala
    290                 295                 300

Thr Leu Leu Val Ala Glu Glu Ala Ala Leu Gly Pro Thr Glu Pro Ala
305                 310                 315                 320

Glu Gly Leu Ser Ala Pro Ser Leu Ser Pro His Cys Cys Pro Cys Arg
                325                 330                 335

Ala Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu Leu Pro Arg Leu His
            340                 345                 350

Gln Leu Cys Cys Arg Met Pro Arg Thr Leu Arg Arg Leu Phe Val Ala
        355                 360                 365

Glu Leu Cys Ser Trp Met Ala Leu Met Thr Phe Thr Leu Phe Tyr Thr
    370                 375                 380

Asp Phe Val Gly Glu Gly Leu Tyr Gln Gly Val Pro Arg Ala Glu Pro
385                 390                 395                 400

Gly Thr Glu Ala Arg Arg His Tyr Asp Glu Gly Val Arg Met Gly Ser
                405                 410                 415

Leu Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu Val Phe Ser Leu Val
            420                 425                 430

Met Asp Arg Leu Val Gln Arg Phe Gly Thr Arg Ala Val Tyr Leu Ala
        435                 440                 445

Ser Val Ala Ala Phe Pro Val Ala Ala Gly Ala Thr Cys Leu Ser His
    450                 455                 460

Ser Val Ala Val Val Thr Ala Ser Ala Ala Leu Thr Gly Phe Thr Phe
465                 470                 475                 480

Ser Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala Ser Leu Tyr His Arg
                485                 490                 495

Glu Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly Asp Thr Gly Gly Ala
            500                 505                 510

Ser Ser Glu Asp Ser Leu Met Thr Ser Phe Leu Pro Gly Pro Lys Pro
        515                 520                 525

Gly Ala Pro Phe Pro Asn Gly His Val Gly Ala Gly Gly Ser Gly Leu
    530                 535                 540

Leu Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser Ala Cys Asp Val Ser
545                 550                 555                 560

Val Arg Val Val Val Gly Glu Pro Thr Glu Ala Arg Val Val Pro Gly
                565                 570                 575

Arg Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp Ser Ala Phe Leu Leu
            580                 585                 590

Ser Gln Val Ala Pro Ser Leu Phe Met Gly Ser Ile Val Gln Leu Ser
        595                 600                 605

Gln Ser Val Thr Ala Tyr Met Val Ser Ala Ala Gly Leu Gly Leu Val
    610                 615                 620

Ala Ile Tyr Phe Ala Thr Gln Val Val Phe Asp Lys Ser Asp Leu Ala
625                 630                 635                 640

Lys Tyr Ser Ala Met Val Gln Arg Leu Trp Val Ser Arg Leu Leu Arg
                645                 650                 655

His Arg Lys Ala Gln Leu Leu Leu Val Asn Leu Leu Thr Phe Gly Leu
            660                 665                 670

Glu Val Cys Leu Ala Ala Gly Ile Thr Tyr Val Pro Pro Leu Leu Leu
        675                 680                 685

Glu Val Gly Val Glu Glu
    690





















<210> 48
<211> 694
<212> PRT

<213> Artificial Sequence

<220>

<223> Human P501S (amino acids 1-50) fused to St.pneum. C-LytA P2 helper epitope C-Lyta fused to Human
P501S (amino acids 51-553) - codon optimised

<400> 48

Met Val Gln Arg Leu Trp Val Ser Arg Leu Leu Arg His Arg Lys Ala
1               5                   10                  15

Gln Leu Leu Leu Val Asn Leu Leu Thr Phe Gly Leu Glu Val Cys Leu
            20                  25                  30

Ala Ala Gly Ile Thr Tyr Val Pro Pro Leu Leu Leu Glu Val Gly Val
        35                  40                  45

Glu Glu Met Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro Lys
    50                  55                  60

Asp Lys Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser
65                  70                  75                  80

Gly Tyr Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp
                85                  90                  95

Tyr Trp Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile
            100                 105                 110

Ala Asp Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly
        115                 120                 125

Trp Val Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly
    130                 135                 140

Ala Met Gln Tyr Ile Lys Ala Asn Ser Lys Phe Ile Gly Ile Thr Glu
145                 150                 155                 160

Gly Val Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly
                165                 170                 175

Trp Tyr Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Lys
            180                 185                 190






Phe 




V 1 Leu 


Gl 


He Glv 


Pro 


Val 


Leu 


Gly 




Val 


Cvs 




6 195 6 






200 
















Val Pro Leu Leu Gly Ser Ala Ser Asp His Trp Arg Gly Arg Tyr Gly
    210                 215                 220

Arg Arg Arg Pro Phe Ile Trp Ala Leu Ser Leu Gly Ile Leu Leu Ser
225                 230                 235                 240

Leu Phe Leu Ile Pro Arg Ala Gly Trp Leu Ala Gly Leu Leu Cys Pro
                245                 250                 255

Asp Pro Arg Pro Leu Glu Leu Ala Leu Leu Ile Leu Gly Val Gly Leu
            260                 265                 270

Leu Asp Phe Cys Gly Gln Val Cys Phe Thr Pro Leu Glu Ala Leu Leu
        275                 280                 285

Ser Asp Leu Phe Arg Asp Pro Asp His Cys Arg Gln Ala Tyr Ser Val
    290                 295                 300

Tyr Ala Phe Met Ile Ser Leu Gly Gly Cys Leu Gly Tyr Leu Leu Pro
305                 310                 315                 320

Ala Ile Asp Trp Asp Thr Ser Ala Leu Ala Pro Tyr Leu Gly Thr Gln
                325                 330                 335

Glu Glu Cys Leu Phe Gly Leu Leu Thr Leu Ile Phe Leu Thr Cys Val
            340                 345                 350

Ala Ala Thr Leu Leu Val Ala Glu Glu Ala Ala Leu Gly Pro Thr Glu
        355                 360                 365

Pro Ala Glu Gly Leu Ser Ala Pro Ser Leu Ser Pro His Cys Cys Pro
    370                 375                 380

Cys Arg Ala Arg Leu Ala Phe Arg Asn Leu Gly Ala Leu Leu Pro Arg
385                 390                 395                 400

Leu His Gln Leu Cys Cys Arg Met Pro Arg Thr Leu Arg Arg Leu Phe
                405                 410                 415

Val Ala Glu Leu Cys Ser Trp Met Ala Leu Met Thr Phe Thr Leu Phe
            420                 425                 430

Tyr Thr Asp Phe Val Gly Glu Gly Leu Tyr Gln Gly Val Pro Arg Ala
        435                 440                 445

Glu Pro Gly Thr Glu Ala Arg Arg His Tyr Asp Glu Gly Val Arg Met
    450                 455                 460

Gly Ser Leu Gly Leu Phe Leu Gln Cys Ala Ile Ser Leu Val Phe Ser
465                 470                 475                 480

Leu Val Met Asp Arg Leu Val Gln Arg Phe Gly Thr Arg Ala Val Tyr
                485                 490                 495

Leu Ala Ser Val Ala Ala Phe Pro Val Ala Ala Gly Ala Thr Cys Leu
            500                 505                 510

Ser His Ser Val Ala Val Val Thr Ala Ser Ala Ala Leu Thr Gly Phe
        515                 520                 525

Thr Phe Ser Ala Leu Gln Ile Leu Pro Tyr Thr Leu Ala Ser Leu Tyr
    530                 535                 540

His Arg Glu Lys Gln Val Phe Leu Pro Lys Tyr Arg Gly Asp Thr Gly
545                 550                 555                 560

Gly Ala Ser Ser Glu Asp Ser Leu Met Thr Ser Phe Leu Pro Gly Pro
                565                 570                 575

Lys Pro Gly Ala Pro Phe Pro Asn Gly His Val Gly Ala Gly Gly Ser
            580                 585                 590

Gly Leu Leu Pro Pro Pro Pro Ala Leu Cys Gly Ala Ser Ala Cys Asp
        595                 600                 605

Val Ser Val Arg Val Val Val Gly Glu Pro Thr Glu Ala Arg Val Val
    610                 615                 620

Pro Gly Arg Gly Ile Cys Leu Asp Leu Ala Ile Leu Asp Ser Ala Phe
625                 630                 635                 640

Leu Leu Ser Gln Val Ala Pro Ser Leu Phe Met Gly Ser Ile Val Gln
                645                 650                 655

Leu Ser Gln Ser Val Thr Ala Tyr Met Val Ser Ala Ala Gly Leu Gly
            660                 665                 670

Leu Val Ala Ile Tyr Phe Ala Thr Gln Val Val Phe Asp Lys Ser Asp
        675                 680                 685

Leu Ala Lys Tyr Ser Ala
    690



<211> 1971 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> DNA encoding Human MUC-1 fused to St.pneum. C-LytA P2 helper epitope C-Lyta
<400> 49

atgacaccgg gcacccagtc tcctttcttc ctgctgctgc tcctcacagt gcttacagtt    60
gttacaggtt ctggtcatgc aagctctacc ccaggtggag aaaaggagac ttcggctacc   120
cagagaagtt cagtgcccag ctctactgag aagaatgctg tgagtatgac cagcagcgta   180
ctctccagcc acagccccgg ttcaggctcc tccaccactc agggacagga tgtcactctg   240
gccccggcca cggaaccagc ttcaggttca gctgccacct ggggacagga tgtcacctcg   300
gtcccagtca ccaggccagc cctgggctcc accaccccgc cagcccacga tgtcacctca   360
gccccggaca acaagccagc cccgggctcc accgcccccc cagcccacgg tgtcacctcg   420
gccccggaca ccaggccgcc cccgggctcc accgcccccc cagcccacgg tgtcacctcg   480
gccccggaca ccaggccgcc cccgggctcc accgcgcccg cagcccacgg tgtcacctcg   540
gccccggaca ccaggccggc cccgggctcc accgcccccc cagcccatgg tgtcacctcg   600
gccccggaca acaggcccgc cttggcgtcc accgcccctc cagtccacaa tgtcacctcg   660
gcctcaggct ctgcatcagg ctcagcttct actctggtgc acaacggcac ctctgccagg   720
gctaccacaa ccccagccag caagagcact ccattctcaa ttcccagcca ccactctgat   780
actcctacca cccttgccag ccatagcacc aagactgatg ccagtagcac tcaccatagc   840
acggtacctc ctctcacctc ctccaatcac agcacttctc cccagttgtc tactggggtc   900
tctttctttt tcctgtcttt tcacatttca aacctccagt ttaattcctc tctggaagat   960
cccagcaccg actactacca agagctgcag agagacattt ctgaaatgtt tttgcagatt  1020
tataaacaag ggggttttct gggcctctcc aatattaagt tcaggccagg atctgtggtg  1080
gtacaattga ctctggcctt ccgagaaggt accatcaatg tccacgacgt ggagacacag  1140
ttcaatcagt ataaaacgga agcagcctct cgatataacc tgacgatctc agacgtcagc  1200
gtgagtgatg tgccatttcc tttctctgcc cagtctgggg ctggggtgcc aggctggggc  1260
atcgcgctgc tggtgctggt ctgtgttctg gttgcgctgg ccattgtcta tctcattgcc  1320
ttggctgtct gtcagtgccg ccgaaagaac tacgggcagc tggacatctt tccagcccgg  1380
gatacctacc atcctatgag cgagtacccc acctaccaca cccatgggcg ctatgtgccc  1440
cctagcagta ccgatcgtag cccctatgag aaggtttctg caggtaatgg tggcagcagc  1500
ctctcttaca caaacccagc agtggcagcc acttctgcca acttgatggc ggccgcttac  1560
gtacattccg acggctctta tccaaaagac aagtttgaga aaatcaatgg cacttggtac  1620
tactttgaca gttcaggcta tatgcttgca gaccgctgga ggaagcacac agacggcaac  1680
tggtactggt tcgacaactc aggcgaaatg gctacaggct ggaagaaaat cgctgataag  1740
tggtactatt tcaacgaaga aggtgccatg aagacaggct gggtcaagta caaggacact  1800
tggtactact tagacgctaa agaaggcgcc atgcaataca tcaaggctaa ctctaagttc  1860
attggtatca ctgaaggcgt catggtatca aatgccttta tccagtcagc ggacggaaca  1920
ggctggtact acctcaaacc agacggaaca ctggcagaca ggccagaatg a           1971



<212> PRT 

<213> Artificial Sequence



<220> 

<223> Human MUC-1 fused to St. pneum. C-LytA P2 helper epitope C-LytA



<400> 50 
Met Thr Pro Gly Thr Gln Ser Pro Phe Phe Leu Leu Leu Leu Leu Thr
1               5                   10                  15

Val Leu Thr Val Val Thr Gly Ser Gly His Ala Ser Ser Thr Pro Gly
            20                  25                  30

Gly Glu Lys Glu Thr Ser Ala Thr Gln Arg Ser Ser Val Pro Ser Ser
        35                  40                  45

Thr Glu Lys Asn Ala Val Ser Met Thr Ser Ser Val Leu Ser Ser His
    50                  55                  60

Ser Pro Gly Ser Gly Ser Ser Thr Thr Gln Gly Gln Asp Val Thr Leu
65                  70                  75                  80

Ala Pro Ala Thr Glu Pro Ala Ser Gly Ser Ala Ala Thr Trp Gly Gln
                85                  90                  95

Asp Val Thr Ser Val Pro Val Thr Arg Pro Ala Leu Gly Ser Thr Thr
            100                 105                 110

Pro Pro Ala His Asp Val Thr Ser Ala Pro Asp Asn Lys Pro Ala Pro
        115                 120                 125

Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr
    130                 135                 140

Arg Pro Pro Pro Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser
145                 150                 155                 160

Ala Pro Asp Thr Arg Pro Pro Pro Gly Ser Thr Ala Pro Ala Ala His
                165                 170                 175

Gly Val Thr Ser Ala Pro Asp Thr Arg Pro Ala Pro Gly Ser Thr Ala
            180                 185                 190

Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Asn Arg Pro Ala Leu
        195                 200                 205

Ala Ser Thr Ala Pro Pro Val His Asn Val Thr Ser Ala Ser Gly Ser
    210                 215                 220

Ala Ser Gly Ser Ala Ser Thr Leu Val His Asn Gly Thr Ser Ala Arg
225                 230                 235                 240

Ala Thr Thr Thr Pro Ala Ser Lys Ser Thr Pro Phe Ser Ile Pro Ser
                245                 250                 255

His His Ser Asp Thr Pro Thr Thr Leu Ala Ser His Ser Thr Lys Thr
            260                 265                 270

Asp Ala Ser Ser Thr His His Ser Thr Val Pro Pro Leu Thr Ser Ser
        275                 280                 285

Asn His Ser Thr Ser Pro Gln Leu Ser Thr Gly Val Ser Phe Phe Phe
    290                 295                 300

Leu Ser Phe His Ile Ser Asn Leu Gln Phe Asn Ser Ser Leu Glu Asp
305                 310                 315                 320

Pro Ser Thr Asp Tyr Tyr Gln Glu Leu Gln Arg Asp Ile Ser Glu Met
                325                 330                 335

Phe Leu Gln Ile Tyr Lys Gln Gly Gly Phe Leu Gly Leu Ser Asn Ile
            340                 345                 350

Lys Phe Arg Pro Gly Ser Val Val Val Gln Leu Thr Leu Ala Phe Arg
        355                 360                 365

Glu Gly Thr Ile Asn Val His Asp Val Glu Thr Gln Phe Asn Gln Tyr
    370                 375                 380

Lys Thr Glu Ala Ala Ser Arg Tyr Asn Leu Thr Ile Ser Asp Val Ser
385                 390                 395                 400

Val Ser Asp Val Pro Phe Pro Phe Ser Ala Gln Ser Gly Ala Gly Val
                405                 410                 415

Pro Gly Trp Gly Ile Ala Leu Leu Val Leu Val Cys Val Leu Val Ala
            420                 425                 430

Leu Ala Ile Val Tyr Leu Ile Ala Leu Ala Val Cys Gln Cys Arg Arg
        435                 440                 445

Lys Asn Tyr Gly Gln Leu Asp Ile Phe Pro Ala Arg Asp Thr Tyr His
    450                 455                 460

Pro Met Ser Glu Tyr Pro Thr Tyr His Thr His Gly Arg Tyr Val Pro
465                 470                 475                 480

Pro Ser Ser Thr Asp Arg Ser Pro Tyr Glu Lys Val Ser Ala Gly Asn
                485                 490                 495

Gly Gly Ser Ser Leu Ser Tyr Thr Asn Pro Ala Val Ala Ala Thr Ser
            500                 505                 510

Ala Asn Leu Met Ala Ala Ala Tyr Val His Ser Asp Gly Ser Tyr Pro
        515                 520                 525

Lys Asp Lys Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser
    530                 535                 540

Ser Gly Tyr Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn
545                 550                 555                 560

Trp Tyr Trp Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys
                565                 570                 575

Ile Ala Asp Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr
            580                 585                 590

Gly Trp Val Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu
        595                 600                 605

Gly Ala Met Gln Tyr Ile Lys Ala Asn Ser Lys Phe Ile Gly Ile Thr
    610                 615                 620

Glu Gly Val Met Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr
625                 630                 635                 640

Gly Trp Tyr Tyr Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu
                645                 650                 655


<210> 51
<211> 2037
<212> DNA

<213> Artificial Sequence

<220>

<223> DNA encoding St. pneum. C-LytA P2 helper epitope C-LytA fused to Human MUC-1

<400> 51




EP1 511 768 B1 



atgggatgga gctgtatcat cctcttcttg gtagcaacag ctacaggtgt ccactcccag 60
gtccaaatgg cggccgctta cgtacattcc gacggctctt atccaaaaga caagtttgag 120
aaaatcaatg gcacttggta ctactttgac agttcaggct atatgcttgc agaccgctgg 180
aggaagcaca cagacggcaa ctggtactgg ttcgacaact caggcgaaat ggctacaggc 240
tggaagaaaa tcgctgataa gtggtactat ttcaacgaag aaggtgccat gaagacaggc 300
tgggtcaagt acaaggacac ttggtactac ttagacgcta aagaaggcgc catgcaatac 360
atcaaggcta actctaagtt cattggtatc actgaaggcg tcatggtatc aaatgccttt 420
atccagtcag cggacggaac aggctggtac tacctcaaac cagacggaac actggcagac 480
aggccagaaa tgacaccggg cacccagtct cctttcttcc tgctgctgct cctcacagtg 540
cttacagttg ttacaggttc tggtcatgca agctctaccc caggtggaga aaaggagact 600
tcggctaccc agagaagttc agtgcccagc tctactgaga agaatgctgt gagtatgacc 660
agcagcgtac tctccagcca cagccccggt tcaggctcct ccaccactca gggacaggat 720
gtcactctgg ccccggccac ggaaccagct tcaggttcag ctgccacctg gggacaggat 780
gtcacctcgg tcccagtcac caggccagcc ctgggctcca ccaccccgcc agcccacgat 840
gtcacctcag ccccggacaa caagccagcc ccgggctcca ccgccccccc agcccacggt 900
gtcacctcgg ccccggacac caggccgccc ccgggctcca ccgccccccc agcccacggt 960
gtcacctcgg ccccggacac caggccgccc ccgggctcca ccgcgcccgc agcccacggt 1020
gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc agcccatggt 1080
gtcacctcgg ccccggacaa caggcccgcc ttggcgtcca ccgcccctcc agtccacaat 1140
gtcacctcgg cctcaggctc tgcatcaggc tcagcttcta ctctggtgca caacggcacc 1200
tctgccaggg ctaccacaac cccagccagc aagagcactc cattctcaat tcccagccac 1260
cactctgata ctcctaccac ccttgccagc catagcacca agactgatgc cagtagcact 1320
caccatagca cggtacctcc tctcacctcc tccaatcaca gcacttctcc ccagttgtct 1380
actggggtct ctttcttttt cctgtctttt cacatttcaa acctccagtt taattcctct 1440
ctggaagatc ccagcaccga ctactaccaa gagctgcaga gagacatttc tgaaatgttt 1500
ttgcagattt ataaacaagg gggttttctg ggcctctcca atattaagtt caggccagga 1560
tctgtggtgg tacaattgac tctggccttc cgagaaggta ccatcaatgt ccacgacgtg 1620
gagacacagt tcaatcagta taaaacggaa gcagcctctc gatataacct gacgatctca 1680
gacgtcagcg tgagtgatgt gccatttcct ttctctgccc agtctggggc tggggtgcca 1740
ggctggggca tcgcgctgct ggtgctggtc tgtgttctgg ttgcgctggc cattgtctat 1800
ctcattgcct tggctgtctg tcagtgccgc cgaaagaact acgggcagct ggacatcttt 1860
ccagcccggg atacctacca tcctatgagc gagtacccca cctaccacac ccatgggcgc 1920
tatgtgcccc ctagcagtac cgatcgtagc ccctatgaga aggtttctgc aggtaatggt 1980
ggcagcagcc tctcttacac aaacccagca gtggcagcca cttctgccaa cttgtag 2037
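
The lengths stated in the listing can be cross-checked arithmetically. The following is a minimal Python sketch, assuming that each coding sequence is a single open reading frame that ends in exactly one stop codon; the helper name `encoded_length` is illustrative and does not come from the specification.

```python
def encoded_length(nucleotides: int) -> int:
    """Number of residues encoded by an ORF of the given length (assumes one terminal stop codon)."""
    if nucleotides % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    # Every codon encodes one residue except the terminal stop codon.
    return nucleotides // 3 - 1


# SEQ ID NO:51 is listed as 2037 nucleotides, SEQ ID NO:49 as 1971 nucleotides.
print(encoded_length(2037))  # 678, the <211> length given for SEQ ID NO:52
print(encoded_length(1971))  # 656
```

Under that assumption the 2037-nucleotide sequence of SEQ ID NO:51 corresponds to the 678-residue protein of SEQ ID NO:52, and the 1971-nucleotide sequence of SEQ ID NO:49 to a 656-residue protein.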



<210> 52 
<211> 678 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> St. pneum. C-LytA P2 helper epitope C-LytA fused to Human MUC-1
<400> 52 
Claims

1. A fusion partner protein comprising a choline binding domain and a heterologous promiscuous T helper epitope.

2. A fusion partner protein according to claim 1 wherein the choline binding domain is the C terminus of LytA or a
derivative thereof in which the derivative of the C-terminus of LytA retains both the capability of acting as an
immunological partner and an expression enhancer.

3. A fusion partner protein according to claim 2 wherein the C-LytA or derivative thereof comprises at least four repeats
of any of SEQ ID NO:1 to 6.

4. A fusion partner protein according to any of claims 1 to 3, wherein the choline binding domain is selected from the
group comprising:

a) the C-terminal domain of LytA as set forth in SEQ ID NO:7; or

b) the sequence of SEQ ID NO:8; or

c) a peptide sequence comprising an amino acid sequence having at least 85% identity, preferably at least 90%
identity, more preferably at least 95% identity, most preferably at least 97-99% identity, to any of SEQ ID NO:1 to 6; or

d) a peptide sequence comprising an amino acid sequence having at least 15, 20, 30, 40, 50 or 100 contiguous
amino acids from the amino acid sequence of SEQ ID NO:7 or SEQ ID NO:8.

5. A fusion protein comprising a fusion partner protein as claimed in any of claims 1 to 4 and a heterologous protein.

6. A fusion protein as claimed in claim 5 wherein the heterologous protein is chemically conjugated to the fusion partner.

7. A fusion protein as claimed in claim 5 or 6 wherein the heterologous protein is derived from an organism selected
from the following group: Human Immunodeficiency virus HIV-1, human herpes simplex viruses, cytomegalovirus,
Rotavirus, Epstein Barr virus, Varicella Zoster Virus, from a hepatitis virus such as hepatitis B virus, hepatitis A virus,
hepatitis C virus and hepatitis E virus, from Respiratory Syncytial virus, parainfluenza virus, measles virus, mumps
virus, human papilloma viruses, flaviviruses or Influenza virus, from Neisseria spp, Moraxella spp, Bordetella spp;
Mycobacterium spp., including M. tuberculosis; Escherichia spp, including enterotoxic E. coli; Salmonella spp;
Listeria spp; Helicobacter spp; Staphylococcus spp., including S. aureus, S. epidermidis; Borrelia spp; Chlamydia
spp., including C. trachomatis, C. pneumoniae; Plasmodium spp., including P. falciparum; Toxoplasma spp., Candida spp.

8. A fusion protein as claimed in claim 5 or 6 wherein the heterologous protein is a tumour associated protein or tissue
specific protein or immunogenic fragment thereof.

9. A fusion protein as claimed in claim 8 wherein the heterologous protein or fragment thereof is selected from MAGE 1,
MAGE 3, MAGE 4, PRAME, BAGE, LAGE 1, LAGE 2, SAGE, HAGE, XAGE, PSA, PAP, PSCA, prostein, P501S,
HASH2, Cripto, B726, NY-BR1.1, P510, MUC-1, Prostase, STEAP, tyrosinase, telomerase, survivin, CASB616,
P53, or her 2 neu.

10. A fusion protein as claimed in any of claims 6 to 9 further comprising an affinity tag of at least 4 histidine residues.

11. A nucleic acid sequence encoding a protein of claim 1 to 10.

12. An expression vector comprising a nucleic acid sequence of claim 11.

13. A host cell transformed with a nucleic acid sequence of claim 11 or with an expression vector of claim 12.
14. An immunogenic composition comprising a protein as claimed in any of claim 1 to 10 or a DNA sequence as claimed
in claim 11 and a pharmaceutically acceptable excipient.

15. An immunogenic composition as claimed in claim 14 which additionally comprises a TH-1 inducing adjuvant.

16. An immunogenic composition as claimed in claim 15 in which the TH-1 inducing adjuvant is selected from the group
of adjuvants comprising: 3D-MPL, QS21, a mixture of QS21 and cholesterol, a CpG oligonucleotide or a mixture of
two or more of said adjuvants.

17. A process for the preparation of an immunogenic composition as claimed in any of claims 14 to 16, comprising
admixing the fusion protein of any of claims 6 to 10 or the encoding polynucleotide of claim 11 with a suitable
adjuvant, diluent or other pharmaceutically acceptable carrier.

18. A process for producing a fusion protein of any of claims 1 to 10 comprising culturing a host cell of claim 13 under
conditions sufficient for the production of said fusion protein and recovering the fusion protein from the culture medium.

19. A protein of any of claims 1 to 10 or a DNA sequence of claim 11 for use in medicine.

20. Use of a protein as claimed in any of claim 1 to 10 or a DNA sequence of claim 11 in the manufacture of an
immunogenic composition for eliciting an immune response in a patient.

21. Use according to claim 20, wherein said immune response is to be elicited by sequential administration of i) the said
protein followed by the said DNA sequence; or ii) the said DNA sequence followed by the said protein.

22. Use according to claim 21 wherein said DNA sequence is coated onto biodegradable beads or delivered via a
particle bombardment approach.

23. Use according to claim 21 or claim 22 wherein said protein is adjuvanted.

24. Use of a protein as claimed in any of claim 1 to 10 or a DNA sequence of claim 11 in the manufacture of an
immunogenic composition for immunotherapeutically treating a patient suffering from or susceptible to cancer.

25. Use according to claim 24 wherein said cancer is prostate cancer, colon cancer, lung cancer, breast cancer or
melanoma.
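
The identity thresholds of claim 4(c) and the contiguous-residue thresholds of claim 4(d) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the claims do not fix a comparison method here, so the sketch uses ungapped, position-by-position identity between equal-length sequences, whereas practical comparisons normally rely on an alignment program. The function names and the `variant` sequence are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-by-position identity of two equal-length sequences (simplifying assumption)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def shares_contiguous(candidate: str, reference: str, n: int) -> bool:
    """True if the candidate contains any window of n contiguous residues taken from the reference."""
    return any(reference[i:i + n] in candidate for i in range(len(reference) - n + 1))


repeat4 = "GEMATGWKKIADKWYYFNEE"  # SEQ ID NO:4 (20 residues)
variant = "GEMATGWKKIADKWYYFNEQ"  # hypothetical variant with one substitution at the last position

print(percent_identity(repeat4, variant))                            # 95.0
print(shares_contiguous("AAA" + repeat4[:15] + "AAA", repeat4, 15))  # True
```

With a 20-residue repeat, a single substitution gives exactly 95% identity under this ungapped measure, which shows how coarse the percentage steps are for the short repeats of SEQ ID NO:1 to 6.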



Claims (German text: Patentansprüche)

1. A fusion partner protein comprising a choline binding domain and a heterologous promiscuous T helper epitope.

2. A fusion partner protein according to claim 1, wherein the choline binding domain is the C-terminus of LytA or a
derivative thereof, wherein the derivative of the C-terminus of LytA retains both the ability to function as an
immunological partner and as an expression enhancer.

3. A fusion partner protein according to claim 2, wherein the C-LytA or derivative thereof comprises at least four repeats
of any of SEQ ID NO: 1 to 6.

4. A fusion partner protein according to any of claims 1 to 3, wherein the choline binding domain is selected from the
group comprising the following:

a) the C-terminal domain of LytA as set forth in SEQ ID NO: 7; or

b) the sequence of SEQ ID NO: 8; or

c) a peptide sequence comprising an amino acid sequence having at least 85 % identity, preferably at least 90 %
identity, more preferably at least 95 % identity, most preferably at least 97-99 % identity, to any of SEQ ID NO: 1 to 6; or

d) a peptide sequence comprising an amino acid sequence having at least 15, 20, 30, 40, 50 or 100 contiguous
amino acids from the amino acid sequence of SEQ ID NO: 7 or SEQ ID NO: 8.

5. A fusion protein comprising a fusion partner protein according to any of claims 1 to 4 and a heterologous protein.

6. A fusion protein according to claim 5, wherein the heterologous protein is chemically conjugated to the fusion partner.

7. A fusion protein according to claim 5 or 6, wherein the heterologous protein originates from an organism selected
from the following group: human immunodeficiency virus HIV-1, human herpes simplex viruses, cytomegalovirus,
rotavirus, Epstein-Barr virus, Varicella Zoster virus, from a hepatitis virus such as hepatitis B virus, hepatitis A virus,
hepatitis C virus and hepatitis E virus, from respiratory syncytial virus, parainfluenza virus, measles virus, mumps
virus, human papilloma viruses, flaviviruses or influenza virus, from Neisseria spp., Moraxella spp., Bordetella spp.,
Mycobacterium spp., including M. tuberculosis; Escherichia spp., including enterotoxic E. coli; Salmonella spp.;
Listeria spp.; Helicobacter spp.; Staphylococcus spp., including S. aureus, S. epidermidis; Borrelia spp.; Chlamydia
spp., including C. trachomatis, C. pneumoniae; Plasmodium spp., including P. falciparum; Toxoplasma spp., Candida spp.

8. A fusion protein according to claim 5 or 6, wherein the heterologous protein is a tumour-associated protein or
tissue-specific protein or immunogenic fragment thereof.

9. A fusion protein according to claim 8, wherein the heterologous protein or fragment thereof is selected from MAGE 1,
MAGE 3, MAGE 4, PRAME, BAGE, LAGE 1, LAGE 2, SAGE, HAGE, XAGE, PSA, PAP, PSCA, prostein, P501S,
HASH2, Cripto, B726, NY-BR1.1, P510, MUC-1, Prostase, STEAP, tyrosinase, telomerase, survivin, CASB616,
P53 or her 2 neu.

10. A fusion protein according to any of claims 6 to 9, further comprising an affinity tag of at least 4 histidine residues.

11. A nucleic acid sequence encoding a protein according to claim 1 to 10.

12. An expression vector comprising a nucleic acid sequence according to claim 11.

13. A host cell transformed with a nucleic acid sequence according to claim 11 or with an expression vector according to
claim 12.

14. An immunogenic composition comprising a protein according to any of claims 1 to 10 or a DNA sequence
according to claim 11 and a pharmaceutically acceptable excipient.

15. An immunogenic composition according to claim 14, which additionally comprises a TH-1 inducing adjuvant.

16. An immunogenic composition according to claim 15, wherein the TH-1 inducing adjuvant is selected from the group of
adjuvants comprising 3D-MPL, QS21, a mixture of QS21 and cholesterol, a CpG oligonucleotide or a mixture of
two or more of the adjuvants.

17. A process for the preparation of an immunogenic composition according to any of claims 14 to 16, comprising
admixing the fusion protein according to any of claims 6 to 10 or the encoding polynucleotide according to
claim 11 with a suitable adjuvant, diluent or other pharmaceutically acceptable carrier.

18. A process for producing a fusion protein according to any of claims 1 to 10, comprising culturing a host cell
according to claim 13 under conditions sufficient for the production of the fusion protein and recovering the
fusion protein from the culture medium.

19. A protein according to any of claims 1 to 10 or a DNA sequence according to claim 11 for use in medicine.

20. Use of a protein according to any of claims 1 to 10 or of a DNA sequence according to claim 11
in the manufacture of an immunogenic composition for eliciting an immune response in a patient.

21. Use according to claim 20, wherein the immune response is elicited by sequential administration of i) the
protein followed by the DNA sequence; or ii) the DNA sequence followed by the protein.

22. Use according to claim 21, wherein the DNA sequence is coated onto biodegradable beads or
delivered via a particle bombardment approach.

23. Use according to claim 21 or 22, wherein the protein is adjuvanted.

24. Use of a protein according to any of claims 1 to 10 or of a DNA sequence according to claim 11
in the manufacture of an immunogenic composition for the immunotherapeutic treatment of a patient
suffering from or susceptible to cancer.

25. Use according to claim 24, wherein the cancer is prostate cancer, colon cancer, lung cancer, breast cancer or
melanoma.

Claims (French text: Revendications)

1. A fusion partner protein comprising a choline binding domain and a multivalent helper T lymphocyte epitope.

2. A fusion partner protein according to claim 1, wherein the choline binding domain is the C-terminal end of LytA
or a derivative thereof, wherein the derivative of the C-terminal end of LytA retains the capacity
to act both as an immunogenic partner and as an expression enhancer.

3. A fusion partner protein according to claim 2, wherein the C-LytA or a derivative thereof comprises
at least four repeats of any one of SEQ ID NO: 1 to 6.

4. A fusion partner protein according to any one of claims 1 to 3, wherein the choline binding domain
is selected from the group consisting of:

a) the C-terminal domain of LytA as represented by SEQ ID NO: 7; or

b) the sequence of SEQ ID NO: 8; or

c) a peptide sequence comprising an amino acid sequence having at least 85 % identity,
preferably at least 90 % identity, more preferably at least 95 % identity, and most preferably
at least 97-99 % identity, with any one of SEQ ID NO: 1 to 6; or

d) a peptide sequence comprising an amino acid sequence having at least 15, 20, 30, 40, 50 or
100 contiguous amino acids of the amino acid sequence of SEQ ID NO: 7 or SEQ ID NO: 8.

5. A fusion protein comprising a fusion partner protein according to any one of claims 1 to 4,
and a heterologous protein.

6. A fusion protein according to claim 5, wherein the heterologous protein is chemically conjugated to the
fusion partner.

7. A fusion protein according to claim 5 or 6, wherein the heterologous protein is derived from an organism
selected from the following group: human immunodeficiency virus HIV-1, human herpes simplex viruses,
cytomegalovirus, rotavirus, Epstein-Barr virus, varicella-zoster virus, from a hepatitis virus such as hepatitis B
virus, hepatitis A virus, hepatitis C virus and hepatitis E virus, from respiratory syncytial virus,
parainfluenza virus, measles virus, mumps virus, human papilloma viruses, flaviviruses
or influenza virus, from Neisseria species, Moraxella species, Bordetella species; Mycobacterium species,
including M. tuberculosis; Escherichia species, including enterotoxic E. coli; Salmonella species;
Listeria species; Helicobacter species; Staphylococcus species, including S. aureus, S. epidermidis;
Borrelia species; Chlamydia species, including C. trachomatis, C. pneumoniae; Plasmodium species,
including P. falciparum; Toxoplasma species, Candida species.

8. A fusion protein according to claim 5 or 6, wherein the heterologous protein is a tumour-associated
protein or a tissue-specific protein or an immunogenic fragment thereof.

9. A fusion protein according to claim 8, wherein the heterologous protein, or a fragment thereof, is
selected from MAGE 1, MAGE 3, MAGE 4, PRAME, BAGE, LAGE 1, LAGE 2, SAGE, HAGE, XAGE, PSA,
PAP, PSCA, prostein, P501S, HASH2, Cripto, B726, NY-BR1.1, P510, MUC-1, Prostase, STEAP, tyrosinase,
telomerase, survivin, CASB616, P53, or her 2 neu.

10. A fusion protein according to any one of claims 6 to 9, further comprising an affinity tag
of at least 4 histidine residues.

11. A nucleic acid sequence encoding a protein according to claim 1 to 10.

12. An expression vector comprising a nucleic acid sequence according to claim 11.

13. A host cell transformed with a nucleic acid sequence according to claim 11 or with an expression
vector according to claim 12.

14. An immunogenic composition comprising a protein according to any one of claims 1 to 10 or a
DNA sequence according to claim 11 and a pharmaceutically acceptable excipient.

15. An immunogenic composition according to claim 14, which further comprises a TH-1 inducing adjuvant.

16. An immunogenic composition according to claim 15, wherein the TH-1 inducing adjuvant is selected from
the group of adjuvants comprising: 3D-MPL, QS21, a mixture of QS21 and cholesterol, a CpG oligonucleotide
or a mixture of two or more of said adjuvants.

17. A process for the preparation of an immunogenic composition according to any one of claims 14 to 16,
comprising admixing the fusion protein according to any one of claims 6 to 10, or the encoding
polynucleotide according to claim 11, with a suitable adjuvant, a diluent or other pharmaceutically
acceptable carrier.

18. A process for producing a fusion protein according to any one of claims 1 to 10, comprising
culturing a host cell according to claim 13 under conditions sufficient to produce said
fusion protein and recovering the fusion protein from the culture medium.

19. A protein according to any one of claims 1 to 10 or a DNA sequence according to claim 11 for
use in medicine.

20. Use of a protein according to any one of claims 1 to 10 or a DNA sequence according to
claim 11 in the manufacture of an immunogenic composition for eliciting an immune response in a
patient.

21. Use according to claim 20, wherein said immune response is to be elicited by sequential
administration of i) said protein followed by said DNA sequence; or ii) said DNA sequence followed by said
protein.

22. Use according to claim 21, wherein said DNA sequence is coated onto biodegradable microspheres
or is delivered by a particle bombardment approach.

23. Use according to claim 21 or claim 22, wherein said protein is combined with an adjuvant.

24. Use of a protein according to any one of claims 1 to 10 or a DNA sequence according to
claim 11 in the manufacture of an immunogenic composition for immunotherapeutic treatment of a patient
having cancer or susceptible to developing cancer.

25. Use according to claim 24, wherein said cancer is prostate cancer, colon cancer,
breast cancer or melanoma.
Fig. 1 - Sequence information for C-LytA.

SEQ ID NO:1 - amino acid sequence of C-LytA repeat 1 

GWQKNDTGYWYVHSD 15 

SEQ ID NO:2 - amino acid sequence of C-LytA repeat 2 

GSYPKDKFEKINGTWYYFDSS 21 

SEQ ID NO:3 - amino acid sequence of C-LytA repeat 3 

GYMLADRWRKHTDGNWYWFDNS 22 

SEQ ID NO:4 - amino acid sequence of C-LytA repeat 4 

GEMATGWKKIADKWYYFNEE 20 

SEQ ID NO:5 - amino acid sequence of C-LytA repeat 5 

GAMKTGWVKYKDTWYYLDAKE 21 

SEQ ID NO:6 - amino acid sequence of C-LytA repeat 6 

GAMVSNAFIQSADGTGWYYLKPD 23

SEQ ID NO:7 - amino acid sequence of C-LytA choline-binding domain

GWQKNDTGYW YVHSDGSYPK DKFEKINGTW YYFDSSGYML ADRWRKHTDG NWYWFDNSGE 60 
MATGWKKIAD KWYYFNEEGA MKTGWVKYKD TWYYLDAKEG AMVSNAFIQS ADGTGWYYLK 120 
PDGTLADRPE FTVEPDGLIT VK 142 

SEQ ID NO:8 - amino acid sequence of C-LytA domain from truncated repeat 1 to repeat
6 (as part of our constructs shown in figure 2)

YVHSDGSYPKDKFEKINGTWYYFDSSGYMLADRWRKHTDGNWYWFDNSGEMATGWKKIADKWYYFNEEGAMKT
GWVKYKDTWYYLDAKEGAMVSNAFIQSADGTGWYYLKPD
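
The relationship between the sequences above can be checked with a small Python sketch: the six repeats of SEQ ID NO:1 to 6, joined in order and followed by the short C-terminal tail, should reproduce SEQ ID NO:7, and SEQ ID NO:8 should be the stretch running from the truncated repeat 1 to the end of repeat 6. The `TAIL` string is an assumption read off the end of SEQ ID NO:7, and the variable names are illustrative.

```python
REPEATS = [
    "GWQKNDTGYWYVHSD",          # SEQ ID NO:1, repeat 1
    "GSYPKDKFEKINGTWYYFDSS",    # SEQ ID NO:2, repeat 2
    "GYMLADRWRKHTDGNWYWFDNS",   # SEQ ID NO:3, repeat 3
    "GEMATGWKKIADKWYYFNEE",     # SEQ ID NO:4, repeat 4
    "GAMKTGWVKYKDTWYYLDAKE",    # SEQ ID NO:5, repeat 5
    "GAMVSNAFIQSADGTGWYYLKPD",  # SEQ ID NO:6, repeat 6
]
TAIL = "GTLADRPEFTVEPDGLITVK"   # assumption: residues of SEQ ID NO:7 that follow repeat 6

SEQ_ID_7 = (
    "GWQKNDTGYWYVHSDGSYPKDKFEKINGTWYYFDSSGYMLADRWRKHTDGNWYWFDNSGE"
    "MATGWKKIADKWYYFNEEGAMKTGWVKYKDTWYYLDAKEGAMVSNAFIQSADGTGWYYLK"
    "PDGTLADRPEFTVEPDGLITVK"
)

# Repeats in order plus the tail should give the full choline-binding domain.
assembled = "".join(REPEATS) + TAIL

# Truncated repeat 1 (its last five residues) followed by repeats 2 to 6.
SEQ_ID_8 = REPEATS[0][-5:] + "".join(REPEATS[1:])

print([len(r) for r in REPEATS])         # [15, 21, 22, 20, 21, 23]
print(assembled == SEQ_ID_7)             # True
print(len(SEQ_ID_8), SEQ_ID_8 in SEQ_ID_7)
```

The repeat lengths printed by the sketch match the residue counts given after each repeat in the figure (15, 21, 22, 20, 21 and 23).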



SEQ ID NO:9 - DNA sequence encoding the amino acid sequence of SEQ ID NO:1 
ggctggcaga agaatgacac tggctactgg tacgtacatt cagac 

SEQ ID NO:10 - DNA sequence encoding the amino acid sequence of SEQ ID NO:2

ggctcttatc caaaagacaa gtttgagaaa atcaatggca cttggtacta ctttgacagt tca

SEQ ID NO:11 - DNA sequence encoding the amino acid sequence of SEQ ID NO:3

ggctatatgc ttgcagaccg ctggaggaag cacacagacg gcaactggta ctggttcgac aactca

SEQ ID NO:12 - DNA sequence encoding the amino acid sequence of SEQ ID NO:4

ggcgaaatgg ctacaggctg gaagaaaatc gctgataagt ggtactattt caacgaagaa

SEQ ID NO:13 - DNA sequence encoding the amino acid sequence of SEQ ID NO:5

ggtgccatga agacaggctg ggtcaagtac aaggacactt ggtactactt agacgctaaa gaa

SEQ ID NO:14 - DNA sequence encoding the amino acid sequence of SEQ ID NO:6

ggcgccatgg tatcaaatgc ctttatccag tcagcggacg gaacaggctg gtactacctc
aaaccagac

SEQ ID NO:15 - DNA sequence encoding the amino acid sequence of SEQ ID NO:7

ggctggcaga agaatgacac Cggctactgg tacgtacact cagacggctc ttatccaaaa 60 

gacaagtttg agaaaatcaa tggcacttgg tactactttg acagttcagg ctatatgctt 120 

gcagaccgct ggaggaagca cacagacggc aactggtact ggttcgacaa ctcaggcgaa 180 

atggctacag gctggaagaa aatcgctgat aagtggtact atttcaacga agaaggtgcc 240 

atgaagacag gctgggtcaa gtacaaggac acttggtact acttagacgc taaagaaggc 300 

gccatggtat caaatgcctt tatccagtca gcggacggaa caggctggta ctacctcaaa 360 

ccagacggaa cactggcaga caggccagaa ttcacagtag agccagatgg cttgattaca 420 
gtaaaataa 429 

SEQ ID NO:16 - DNA sequence encoding the amino acid sequence of SEQ ID NO:8

TACGTACATTCCGACGGCTCTTATCCAAAAGACAAGTTTGAGAAAATCAATGGCACTTGGTACTACTTTGACA
GTTCAGGCTATATGCTTGCAGACCGCTGGAGGAAGCACACAGACGGCAACTGGTACTGGTTCGACAACTCAGG 
CGAAATGGCTACAGGCTGGAAGAAAATCGCTGATAAGTGGTACTATTTCAACGAAGAAGGTGCCATGAAGACA 
GGCTGGGTCAAGTACAAGGACACTTGGTACTACTTAGACGCTAAAGAAGGCGCCATGGTATCAAATGCCTTTA 
TCCAGTCAGCGGACGGAACAGGCTGGTACTACCTCAAACCAGAC 
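
Each DNA sequence in this figure is stated to encode one of the amino acid sequences. A minimal Python sketch of that check follows, assuming the standard genetic code and reading from the first base; the `translate` helper and the table construction are illustrative, not part of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order t, c, a, g.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is ignored and '*' marks a stop codon."""
    dna = "".join(dna.split()).lower()
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


seq_id_9 = "ggctggcaga agaatgacac tggctactgg tacgtacatt cagac"
seq_id_12 = "ggcgaaatgg ctacaggctg gaagaaaatc gctgataagt ggtactattt caacgaagaa"

print(translate(seq_id_9))   # GWQKNDTGYWYVHSD (SEQ ID NO:1)
print(translate(seq_id_12))  # GEMATGWKKIADKWYYFNEE (SEQ ID NO:4)
```

The same helper can be applied to the longer sequences of the listing once the grouping spaces and row-end position counts are removed.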
FIG. 2. CPC and native Constructs

Construct 1 - coding sequence of CPC-P501S (see plasmid of figure 7 - Y1796)
Protein sequence (SEQ ID NO:27)

MAAAYVHSDGSYPKDKFEKINGTWYYFDSSGYMLADRWRKHTDGNWYWFDNSGEMATGWKKIADKWYYF
NEEGAMKTGWVKYKDTWYYLDAKEGAMQYIKANSKFIGITEGVMVSNAFIQSADGTGWYYLKPDGTLAD
RPEKFMYMVLGIGPVLGLVCVPLLGSASDHWRGRYGRRRPFIWALSL
GILLSLFLIPRAGWLAGLLCPDPRPLELALLILGVGLLDFCGQVCFTPLEALLSDLFRDPDHCRQAYSV
YAFMISLGGCLGYLLPAIDWDTSALAPYLGTQEECLFGLLTLIFLTCVAATLLVAEEAALGPTEPAEG
LSAPSLSPHCCPCRARLAFRNLGALLPRLHQLCCRMPRTLRRLFVAELCSWMALMTFTLFYTDFVGE
GLYQGVPRAEPGTEARRHYDEGVRMGSLGLFLQCAISLVFSLVMDRLVQRFGTRAVYLASVAAFPV
AAGATCLSHSVAVVTASAALTGFTFSALQILPYTLASLYHREKQVFLPKYRGDTGGASSEDSLMTSF
LPGPKPGAPFPNGHVGAGGSGLLPPPPALCGASACDVSVRVVVGEPTEARVVPGRGICLDLAILDSAF
LLSQVAPSLFMGSIVQLSQSVTAYMVSAAGLGLVAIYFATQVVFDKSDLAKYSAGGHHHHHH

R1 (plain): aa5-9 (fragment) R4 (bold): aa53-72 P2 (underline): 97-110

R2 (bold): aa10-30 R5 (plain): aa73-93

R3 (plain): aa31-52 R6a (bold): aa94-95 R6b (bold): 113-133
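
The residue ranges in this legend follow from the lengths of the segments that make up the N-terminal part of Construct 1. The Python sketch below recomputes them; the segment sequences are an assumption inferred from the repeats of Fig. 1 and from the nucleotide sequence of the construct, and the labels "leader" and "spacer" are illustrative names for the residues that the legend does not annotate.

```python
SEGMENTS = [
    ("leader", "MAAA"),
    ("R1", "YVHSD"),                    # fragment of repeat 1
    ("R2", "GSYPKDKFEKINGTWYYFDSS"),
    ("R3", "GYMLADRWRKHTDGNWYWFDNS"),
    ("R4", "GEMATGWKKIADKWYYFNEE"),
    ("R5", "GAMKTGWVKYKDTWYYLDAKE"),
    ("R6a", "GA"),                      # first part of repeat 6
    ("spacer 1", "M"),
    ("P2", "QYIKANSKFIGITE"),           # inserted T helper epitope
    ("spacer 2", "GV"),
    ("R6b", "MVSNAFIQSADGTGWYYLKPD"),   # remainder of repeat 6
]


def residue_ranges(segments):
    """Return (name, first, last) with 1-based inclusive residue positions."""
    ranges, start = [], 1
    for name, seq in segments:
        end = start + len(seq) - 1
        ranges.append((name, start, end))
        start = end + 1
    return ranges


for name, first, last in residue_ranges(SEGMENTS):
    print(f"{name}: aa{first}-{last}")
```

Under these assumptions the computed positions agree with the legend: R1 5-9, R2 10-30, R3 31-52, R4 53-72, R5 73-93, R6a 94-95, P2 97-110 and R6b 113-133.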

Nucleotide sequence (SEQ ID NO:28) 

ATGgcggccgctTACGTACATTCCGACGGCTCTTATCCAAAAGACAAGTTTGAGAAAATCAATGGCACTTGGT
ACTACTTTGACAGTTCAGGCTATATGCTTGCAGACCGCTGGAGGAAGCACACAGACGGCAACTGGTACTGGTT
CGACAACTCAGGCGAAATGGCTACAGGCTGGAAGAAAATCGCTGATAAGTGGTACTATTTCAACGAAGAAGGT
GCCATGAAGACAGGCTGGGTCAAGTACAAGGACACTTGGTACTACTTAGACGCTAAAGAAGGCGCCatgcaat
acatcaaggctaactctaagttcattggtatcactgaaggcgtcATGGTATCAAATGCCTTTATCCAGTCAGC
GGACGGAACAGGCTGGTACTACCTCAAACCAGACGGAACACTGGCAGACAGGCCAGAAaagttcatgtaCatg
GTGCTGGGCATTGGTCCAGTGCTGGGCCTGGTCTGTGTCCCGCTCCTAGGCTCAGCCAGTGACCACTGGCGTG
GACGCTATGGCCGCCGCCGGCCCTTCATCTGGGCACTGTCCTTGGGCATCCTGCTGAGCCTCTTTCTCATCCC
AAGGGCCGGCTGGCTAGCAGGGCTGCTGTGCCCGGATCCCAGGCCCCTGGAGCTGGCACTGCTCATCCTGGGC
GTGGGGCTGCTGGACTTCTGTGGCCAGGTGTGCTTCACTCCACTGGAGGCCCTGCTCTCTGACCTCTTCCGGG
ACCCGGACCACTGTCGCCAGGCCTACTCTGTCTATGCCTTCATGATCAGTCTTGGGGGCTGCCTGGGCTACCT
CCTGCCTGCCATTGACTGGGACACCAGTGCCCTGGCCCCCTACCTGGGCACCCAGGAGGAGTGCCTCTTTGGC 
CTGCTCACCCTCATCTTCCTCACCTGCGTAGCAGCCACACTGCTGGTGGCTGAGGAGGCAGCGCTGGGCCCCA 
CCGAGCCAGCAGAAGGGCTGTCGGCCCCCTCCTTGTCGCCCCACTGCTGTCCATGCCGGGCCCGCTTGGCTTT 
CCGGAACCTGGGCGCCCTGCTTCCCCGGCTGCACCAGCTGTGCTGCCGCATGCCCCGCACCCTGCGCCGGCTC
TTCGTGGCTGAGCTGTGCAGCTGGATGGCACTCATGACCTTCACGCTGTTTTACACGGATTTCGTGGGCGAGG 
GGCTGTACCAGGGCGTGCCCAGAGCTGAGCCGGGCACCGAGGCCCGGAGACACTATGATGAAGGCGTTCGGAT 




GCTAGCAGTGAGGACAGCCTGATGACCAGCTTCCTGCCAGGCCCTAAGCCTGGAGCTCCCTTCCCTAATGGAC 
ACGTGGGTGCTGGAGGCAGTGGCCTGCTCCCACCTCCACCCGCGCTCTGCGGGGCCTCTGCCTGTGAtGTCTC 
CGTACGTGTGGTGGTGGGTGAGCCCACCGAGGCCAGGGTGGTTCCGGGCCGGGGCATCTGCCTGGACCTCGCC 
ATCCTGGATAGTGCCTTCCTGCTGTCCCAGGTGGCCCCATCCCTGTTTATGGGCTCCATTGTCCAGCTCAGCC 



AGTCTGTCACTGCCTATATGGTGTCTGCCGCAGGCCTGGGTCTGGTCGCCATTTACTTTGCTACACAGGTAGT 
ATTTGACAAGAGCGACTTGGCCAAATACTCAGCGggtggacaccatcaccatcaccattaa 

Construct 2 - Coding sequence of P501S 55-553 HIS (control) (yeast strain SC333)
Protein sequence (SEQ ID NO:29)



MVLGIGPVLG LVCVPLLGSA SDHWRGRYGR RRPFIWALSL GILLSLFLIP RAGWLAGLLC 60
PDPRPLELAL LILGVGLLDF CGQVCFTPLE ALLSDLFRDP DHCRQAYSVY AFMISLGGCL 120
GYLLPAIDWD TSALAPYLGT QEECLFGLLT LIFLTCVAAT LLVAEEAALG PTEPAEGLSA 180
PSLSPHCCPC RARLAFRNLG ALLPRLHQLC CRMPRTLRRL FVAELCSWMA LMTFTLFYTD 240
FVGEGLYQGV PRAEPGTEAR RHYDEGVRMG SLGLFLQCAI SLVFSLVMDR LVQRFGTRAV 300
YLASVAAFPV AAGATCLSHS VAVVTASAAL TGFTFSALQI LPYTLASLYH REKQVFLPKY 360
RGDTGGASSE DSLMTSFLPG PKPGAPFPNG HVGAGGSGLL PPPPALCGAS ACDVSVRVVV 420
GEPTEARVVP GRGICLDLAI LDSAFLLSQV APSLFMGSIV QLSQSVTAYM VSAAGLGLVA 480
IYFATQVVFD KSDLAKYSAG GHHHHHH 507



Nucleotide sequence (SEQ ID NO:30) 



atgGTGCTGG GCATTGGTCC AGTGCTGGGC CTGGTCTGTG TCCCGCTCCT AGGCTCAGCC 60
AGTGACCACT GGCGTGGACG CTATGGCCGC CGCCGGCCCT TCATCTGGGC ACTGTCCTTG 120
GGCATCCTGC TGAGCCTCTT TCTCATCCCA AGGGCCGGCT GGCTAGCAGG GCTGCTGTGC 180
CCGGATCCCA GGCCCCTGGA GCTGGCACTG CTCATCCTGG GCGTGGGGCT GCTGGACTTC 240
TGTGGCCAGG TGTGCTTCAC TCCACTGGAG GCCCTGCTCT CTGACCTCTT CCGGGACCCG 300
GACCACTGTC GCCAGGCCTA CTCTGTCTAT GCCTTCATGA TCAGTCTTGG GGGCTGCCTG 360
GGCTACCTCC TGCCTGCCAT TGACTGGGAC ACCAGTGCCC TGGCCCCCTA CCTGGGCACC 420
CAGGAGGAGT GCCTCTTTGG CCTGCTCACC CTCATCTTCC TCACCTGCGT AGCAGCCACA 480
CTGCTGGTGG CTGAGGAGGC AGCGCTGGGC CCCACCGAGC CAGCAGAAGG GCTGTCGGCC 540
CCCTCCTTGT CGCCCCACTG CTGTCCATGC CGGGCCCGCT TGGCTTTCCG GAACCTGGGC 600
GCCCTGCTTC CCCGGCTGCA CCAGCTGTGC TGCCGCATGC CCCGCACCCT GCGCCGGCTC 660
TTCGTGGCTG AGCTGTGCAG CTGGATGGCA CTCATGACCT TCACGCTGTT TTACACGGAT 720
TTCGTGGGCG AGGGGCTGTA CCAGGGCGTG CCCAGAGCTG AGCCGGGCAC CGAGGCCCGG 780
AGACACTATG ATGAAGGCGT TCGGATGGGC AGCCTGGGGC TGTTCCTGCA GTGCGCCATC 840
TCCCTGGTCT TCTCTCTGGT CATGGACCGG CTGGTGCAGC GATTCGGCAC TCGAGCAGTC 900
TATTTGGCCA GTGTGGCAGC TTTCCCTGTG GCTGCCGGTG CCACATGCCT GTCCCACAGT 960
GTGGCCGTGG TGACAGCTTC AGCCGCCCTC ACCGGGTTCA CCTTCTCAGC CCTGCAGATC 1020
CTGCCCTACA CACTGGCCTC CCTCTACCAC CGGGAGAAGC AGGTGTTCCT GCCCAAATAC 1080
CGAGGGGACA CTGGAGGTGC TAGCAGTGAG GACAGCCTGA TGACCAGCTT CCTGCCAGGC 1140
CCTAAGCCTG GAGCTCCCTT CCCTAATGGA CACGTGGGTG CTGGAGGCAG TGGCCTGCTC 1200
CCACCTCCAC CCGCGCTCTG CGGGGCCTCT GCCTGTGAtG TCTCCGTACG TGTGGTGGTG 1260
GGTGAGCCCA CCGAGGCCAG GGTGGTTCCG GGCCGGGGCA TCTGCCTGGA CCTCGCCATC 1320
CTGGATAGTG CCTTCCTGCT GTCCCAGGTG GCCCCATCCC TGTTTATGGG CTCCATTGTC 1380
CAGCTCAGCC AGTCTGTCAC TGCCTATATG GTGTCTGCCG CAGGCCTGGG TCTGGTCGCC 1440
ATTTACTTTG CTACACAGGT AGTATTTGAC AAGAGCGACT TGGCCAAATA CTCAGCGggt 1500
ggacaccatc accatcacca ttaa 1524
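
The coding sequences in this listing all end with the lower-case stretch `ggtggacaccatcaccatcaccattaa`. As a hedged illustration (not part of the original disclosure), the sketch below translates that stretch, and the first codons of Construct 2, with the standard genetic code. The names `CODON_TABLE` and `translate` are hypothetical helpers introduced only for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

head = "atgGTGCTGGGCATTGGTCCA"        # first 21 nt of the Construct 2 listing
tail = "ggtggacaccatcaccatcaccattaa"  # last 27 nt shared by the listings

print(translate(head))
print(translate(tail))
```

The tail translates to two glycines, six histidines and a stop, which is consistent with the `GGHHHHHH` ending of the protein listings.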











Construct 3 - Coding sequence of natssP501i . ^ P501maw HIS (yeast strain Y1800) 
Protein sequence (SEQ ID NO:31) 

R1 R2

maavqrlwvsrllrhrkaqlllvnlltfglevclaaaYVHSDGSYPKDKFEKINGTW

R3 R4 R5

YYFDSSGYMLADRWRKHTDGNWYWFDNSGEMATGWKKIADKWYYFNEEGAMKTGWVK

P2 R6
YKDTWYYLDAKEGAMQYIKANSKFIGITEGVMVSNAFIQSADGTGWYYLKPDGTLADRPEKFMY

MVLGIGPVLGLVCVPLLGSASDHWRGRYGRRRPFIWALSLGILLSLFLIPRAGWLAGLLCPDPRPLEL 
ALLILGVGLLDFCGQVCFTPLEALLSDLFRDPDHCRQAYSVYAFMISLGGCLGYLLPAIDWDTSALAP 
YLGTQEECLFGLLTLIFLTCVAATLLVAEEAALGPTEPAEGLSAPSLSPHCCPCRARLAFRNLGALLPR 
LHQLCCRMPRTLRRLFVAELCSWMALMTFTLFYTDFVGEGLYQGVPRAEPGTEARRHYDEGVRMG
SLGLFLQCAISLVFSLVMDRLVQRFGTRAVYLASVAAFPVAAGATCLSHSVAVVTASAALTGFTFSA
LQILPYTLASLYHREKQVFLPKYRGDTGGASSEDSLMTSFLPGPKPGAPFPNGHVGAGGSGLLPPPPA
LCGASACDVSVRVVVGEPTEARVVPGRGICLDLAILDSAFLLSQVAPSLFMGSIVQLSQSVTAYMVS
AAGLGLVAIYFATQVVFDKSDLAKYSAGGHHHHHH

R1 (plain): aa38-42 (fragment)   R4 (bold): aa77-106   P2 (underline): aa130-143

R2 (bold): aa43-64   R5 (plain): aa107-126

R3 (plain): aa65-76   R6a (bold): aa127-128   R6b (bold): aa146-166
natss stands for native signal sequence
Nucleotide sequence (SEQ ID NO:32) 

ATG^CGGCCGTGCAGAGGCTATGGGTATCGAGACTGCTAAGACACCGCyUVAGCTCAGTTGTTGTTGGTTAACT 
TGTTGACCTTCGGGCTGGAAGTCTGTTTGGCggccgctTACGTACATTCCGACGGCTCTTATCCAAAAGACAA 
GTTTGAGAAAATC^TGGCAC^GGTACT^^ 

AGTGGTACTATTTCAACGAAGAAGGTGCGATGAAGACAGGCTGGGTCAAGTACAAGGACAC 

AOACGCTAAAGAAGGCGCCatgcaayicatcaaqqctaactctaaqttca^^ 

GTATCAAATGCCITTATCCAGTCAGCGGACGGAACAGGCTGGTACTACCTCAAACCAGACGGAACACTGGCAG 

ACAGGCC^GAAaagttcatgtaCatgGTGCTGGGCATTGGTCCAGTGCTGGGCCTGGTCTGTGTCCCGCTCCT 
AGGCTCAGCCAGTGACCACTGGCGTGGACGCTATGGCCGCCGCCGGCCCTTCATCTGGGCACTGTCCTTGGGC 
ATCCTGCTGAGCCTCTTTCTCATCCCAAGGGCCGGCTGGCTAGCAGGGCTGCTGTGCCCGGATCCCAGGCCCC 
TGGAGCTGGCACTGCTCATCCTGGGCGTGGGGCTGCTGGACTTCTGTGGCCAGGTQTGCTTCACTCC^CrGGA 
GGCCCTGCTCTCTGACCTCTTCCGGGACCCGGACCACTGTCGCCAGGCCTACTCTGTCTATGCCTTCATGATC 
AGTCTTGGGGGCTGCCTGGGCTACCTCCTGCCTGCCATTGACTGGGACACCAGTGCCCTGGCCCCCTACCTGG 
GCACCCAGGAGGAGTGCCTCTTTGGCCTGCTCACCCTCATCTTCCTCACCTGCGTAGCAGCCACACTGCTGGT 
GGCTGAGGAGGCAGCGCTGGGCCCCACCGAGCCAGCAGAAGGGCTGTCGGCCCCCTCCTTGTCGCCCCACTGC 
TGTCCATOCCGGGCCCGCTTGGCTTrCCGGAACCTGGGCGCCCTGCTTCCCCGGCTGCACCAGCTGTGCTGCC 
GCATGCCCCGCACCCTGCGCCGGCTCTTCGTGGCTGAGCTGTGCAGCTGGATGGCACTCATGACCTTCACGCT 
GTTTTACACGGATTTCGTGGGCGAGGGGCTGTACCAGGGCGTGCCCAGAGCTGAGCCGGGCACCGAGGCCCGG 
AGACACTATGATGAAGGCGTTCGGATGGGCAGCCTGGGGCTGTTCCTGCAGTGCGCCATCTCCCTGGTCTTCT 
CTCTGGTCATGGACCGGCTGGTGCAGCGATTCGGCACTCGAGCAGTCTATTTGGCCAGTGTGGCAGCTTTCCC 

ACCTTCTCAGCCCTGCAGATCCTGCCCTACACACTGGCCTCCCTCTACCACCGGGAGAAGCAGGTGTTCCTGC 
CCAAATACCGAGGGGACACTGGAGGTGCTAGCAGTGAGGACAGCCTGATGACCAGCTTCCTGCCAGGCCCTAA 
GCCTGGAGCTCCCTTCCCTAATGGACACGTGGGTGCTGGAGGCAGTGGCCTGCTCCCACCTCCACCCGCGCTC 

GCCGGGGCATCTGCCTGGACCTCGCCATCCTGGATAGTGCCTTCCTGCTGTCCCAGGTGGCCCCATCCCTGTT 
TATGGGCTCCATTGTCCAGCTCAGCCAGTCTGTCACTGCCTATATGGTGTCTGCCGCAGGCCTGGGTCTGGTC 
GCCATTTACTTTGCTACACAGGTAGTATTTGACAAGAGCGACTTGGCCAAATACTCAGCGggtggacaccatC 
accatcaccattaa 

Construct 4 - Coding sequence of alphapreCPC-P501 S i.s53 HIS (yeast strain Y1802)
Protein sequence (SEQ ID NO:33)

Alpha-pre signal R1 R2 R3

MAARFPSIFTAVLFAASSALAAAYVHSDGSYPKDKFEKINGTWYYFDSSGYMLADRWRKHTDGNWYWFD

R4 R5 P2

NSGEMATGWKKIADKWYYFNEEGAMKTGWVKYKDTWYYLDAKEGAMQYIKANSKFIGITEGVMVSNAFI
R6

QSADGTGWYYLKPDGTLADRPEKFMYMVLGIGPVLGLVCVPLLGSASDHWRGRYGRRRPFIWALSLGILLSLF

LIPRAGWLAGLLCPDPRPLELALLILGVGLLDFCGQVCFTPLEALLSDLFRDPDHCRQAYSVYAFMISLGGCL
GYLLPAIDWDTSALAPYLGTQEECLFGLLTLIFLTCVAATLLVAEEAALGPTEPAEGLSAPSLSPHCCPCRAR
LAFRNLGALLPRLHQLCCRMPRTLRRLFVAELCSWMALMTFTLFYTDFVGEGLYQGVPRAEPGTEARRHYDEG
VRMGSLGLFLQCAISLVFSLVMDRLVQRFGTRAVYLASVAAFPVAAGATCLSHSVAVVTASAALTGFTFSALQ
ILPYTLASLYHREKQVFLPKYRGDTGGASSEDSLMTSFLPGPKPGAPFPNGHVGAGGSGLLPPPPALCGASAC
DVSVRVVVGEPTEARVVPGRGICLDLAILDSAFLLSQVAPSLFMGSIVQLSQSVTAYMVSAAGLGLVAIYFAT
QVVFDKSDLAKYSAGGHHHHHH

Alpha-pre signal (bold): aa4-22

R1 (plain): aa24-28 (fragment)   R4 (bold): aa72-91   P2 (underline): aa116-129

R2 (bold): aa29-49   R5 (plain): aa92-112

R3 (plain): aa50-71   R6a (bold): aa113-114   R6b (bold): aa132-152

Alphapre stands for alpha pre signal sequence

Nucleotide sequence (SEQ ID NO:34) 

TACGTACATTCCGACGGCTCTTATCCAAAAGACAAGlTTGAGAAAATCAATGGCACTTGGTACrACrTTOACA 

CGAAATGGCTACAGGCTGGAAGAAAATCGCTGATAAGTGGTACTATTTCAACGAAGAAGGTGCCATGAAGACA 
GGCTGGGTCAAGTACAAGQACACTTGGTACTACTTAQACGCTAAAGAAGGCGCCatQ caatacatcaaQgcta 
actetaaattcattqqtatcactaaa qqcqtcATGGTATCAAATGCCTrrATCCAGTCAGCGGACGGAACAGG 
CTGGTACTACCTCAAACCAGACGGAACACTGGCAGACAGGCCAGAA 

ATGgcGGCCAGATTTCCTTCAATTTTTACTGCAGTTTTATTCGCAGCATCCTCCGCATTAGCggccgctTACG 
TACATTCCGACGGCTCTTATCCAAAAGACAAGTTTGAGAAAATCAATGGCACTTGGTACTACTTTGACAGTTC 
AGGCTATATGCTTGCAGACCGCTGGAGGAAGCACACAGACGGCAACTGGTACTGGTTCGACAACTCAGGCGAA 
ATGGCTACAGGCTGGAAGAAAATCGCTGATAAGTGGTACTATTTCAACGAAGAAGGTGCCATGAAGACAGGCT 
GGGTCAAGTACAAGGACACTTGGTACTACTTAGACGCTAAAGAAGGCGCCatg caata catca agqctaactc 
taaqttcattqqtatcactqaa qqcqtcATGGTATCAAATGCCTTTATCCAGTCAGCGGACGGAACAGGCTGG 
TACTACCTCAAACCAGACGGAACACTGGCAGACAGGCCAGAAgctggtattacttacgttccaccattgttgt 

CCCGCTCCTAGGCTCAGCCAGTGACCACTGGCGTGGACGCTATGGCCGCCGCCGGCCCTTCATCTGGGCACTG 




TCCACTGGAGGCCCTGCTCTCTGACCTCTTCCGGGACCCGGACCACTGTCGCCAGGCCTACTCTGTCTATGCT 



CTACCTGGGCACCCAGGAGGAGTGCCTCTTTGGCCTGCTCACCCTCATCTTCCTCACCTGCGTAGCAGCCACA
CTGCTGGTGGCTGAGGAGGCAGCGCTGGGCCCCACCGAGCCAGCAGAAGGGCTGTCGGCCCCCTCCTTGTCGC
CCCACTGCTGTCCATGCCGGGCCCGCTTGGCTTTCCGGAACCTGGGCGCCCTGCTTCCCCGGCTGCACCAGCT 
GTGCTGCCGCATGCCCCGCACCCTGCGCCGGCTCTTCGTGGCTGAGCTGTGCAGCTGGATGGCACTCATGACC 
TTCACGCTGTTTTACACGGATTTCGTGGGCGAGGGGCTGTACCAGGGCGTGCCCAGAGCTGAGCCGGGCACCG 

GGTCTTCTCTCTGGTCATGGACCGGCTGGTGCAGCGATTCGGCACTCGAGCAGTCTATTTGGCCAGTGTGGCA 
GCTTTCCCTGTGGCTGCCGGTGCCAGATGCCTGTCCC^CAGTGTGGCCGTGGTGACAGCTTCAGCCGCCCTCA 

GTTCCTCCCCAAATACCGAGGGGACACTXMAGGTGCTAGC^GTGAGGACAGCCTGATGACCAGCTTC 

CCGCGCTCTGCGGGGCCTCTGCCTGTGAtGTCTCCGTACGTGTGGTGGTGGGTGAGCCCACCGAGGCCAGGGT 

TCCCTGTTTATGGGCTCCATTGTCCAGCTCAGCCAGTCTGTCACTGCCTATATGGTGTCTGCCGCAGGCCTGG 
GTCTGGTCGCCATTTACTTTGCTACACAGGTAGTATTTGACAAGAGCGACTTGGCCAAATACTCAGCGggtgg 
acaccatcaccatcaccattaa 

Construct 5 - Coding sequence of alphaprepro-P501S 55-553 HIS (in plasmid pRIT15068 and yeast strain Y1790)

Protein sequence (SEQ ID NO:35)



MSFLNFTAVL FAASSALAAP VNTTTEDETA QIPAEAVIGY SDLEGDFDVA VLPFSNSTNN 60
GLLFINTTIA SIAAKEEGVS LEKREAEAMV LGIGPVLGLV CVPLLGSASD HWRGRYGRRR 120
PFIWALSLGI LLSLFLIPRA GWLAGLLCPD PRPLELALLI LGVGLLDFCG QVCFTPLEAL 180
LSDLFRDPDH CRQAYSVYAF MISLGGCLGY LLPAIDWDTS ALAPYLGTQE ECLFGLLTLI 240
FLTCVAATLL VAEEAALGPT EPAEGLSAPS LSPHCCPCRA RLAFRNLGAL LPRLHQLCCR 300
MPRTLRRLFV AELCSWMALM TFTLFYTDFV GEGLYQGVPR AEPGTEARRH YDEGVRMGSL 360
GLFLQCAISL VFSLVMDRLV QRFGTRAVYL ASVAAFPVAA GATCLSHSVA VVTASAALTG 420
FTFSALQILP YTLASLYHRE KQVFLPKYRG DTGGASSEDS LMTSFLPGPK PGAPFPNGHV 480
GAGGSGLLPP PPALCGASAC DVSVRVVVGE PTEARVVPGR GICLDLAILD SAFLLSQVAP 540
SLFMGSIVQL SQSVTAYMVS AAGLGLVAIY FATQVVFDKS DLAKYSAGGH HHHHH 595





Nucleotide sequence (SEQ ID NO:36) 



ATGAGTTTCC TCAATTTTAC TGCAGTTTTA TTCGCAGCAT CCTCCGCATT AGCTGCTCCA 60 

GTCAACACTA CAACAGAAGA TGAAACGGCA CAAATTCCGG CTGAAGCTGT CATCGGTTAC 120 

TCAGATTTAG AAGGGGATTT CGATGTTGCT GTTTTGCCAT TTTCCAACAG CACAAATAAC 180 

GGGTTATTGT TTATAAATAC TACTATTGCC AGCATTGCTG CTAAAGAAGA AGGGGTATCT 240 

CTCGAGAAAA GAGAGGCTGA AGCCatgGTG CTGGGCATTG GTCCAGTGCT GGGCCTGGTC 300 

TGTGTCCCGC TCCTAGGCTC AGCCAGTGAC CACTGGCGTG GACGCTATGG CCGCCGCCGG 360

CCCTTCATCT GGGCACTGTC CTTGGGCATC CTGCTGAGCC TCTTTCTCAT CCCAAGGGCC 420

GGCTGGCTAG CAGGGCTGCT GTGCCCGGAT CCCAGGCCCC TGGAGCTGGC ACTGCTCATC 480

CTGGGCGTGG GGCTGCTGGA CTTCTGTGGC CAGGTGTGCT TCACTCCACT GGAGGCCCTG 540

CTCTCTGACC TCTTCCGGGA CCCGGACCAC TGTCGCCAGG CCTACTCTGT CTATGCCTTC 600

ATGATCAGTC TTGGGGGCTG CCTGGGCTAC CTCCTGCCTG CCATTGACTG GGACACCAGT 660

GCCCTGGCCC CCTACCTGGG CACCCAGGAG GAGTGCCTCT TTGGCCTGCT CACCCTCATC 720

TTCCTCACCT GCGTAGCAGC CACACTGCTG GTGGCTGAGG AGGCAGCGCT GGGCCCCACC 780

GAGCCAGCAG AAGGGCTGTC GGCCCCCTCC TTGTCGCCCC ACTGCTGTCC ATGCCGGGCC 840

CGCTTGGCTT TCCGGAACCT GGGCGCCCTG CTTCCCCGGC TGCACCAGCT GTGCTGCCGC 900

ATGCCCCGCA CCCTGCGCCG GCTCTTCGTG GCTGAGCTGT GCAGCTGGAT GGCACTCATG 960

ACCTTCACGC TGTTTTACAC GGATTTCGTC GGCGAGGGGC TGTACCAGGG CGTGCCCAGA 1020

GCTGAGCCGG GCACCGAGGC CCGGAGACAC TATGATGAAG GCGTTCGGAT GGGCAGCCTG 1080

GGGCTGTTCC TGCAGTGCGC CATCTCCCTG GTCTTCTCTC TGGTCATGGA CCGGCTGGTG 1140


CAGCGATTCG GCACTCGAGC AGTCTATTTG GCCAGTGTGG CAGCTTTCCC TGTGGCTGCC 1200

GGTGCCACAT GCCTGTCCCA CAGTGTGGCC GTGGTGACAG CTTCAGCCGC CCTCACCGGG 1260

TTCACCTTCT CAGCCCTGCA GATCCTGCCC TACACACTGG CCTCCCTCTA CCACCGGGAG 1320

AAGCAGGTGT TCCTGCCCAA ATACCGAGGG GACACTGGAG GTGCTAGCAG TGAGGACAGC 1380

CTGATGACCA GCTTCCTGCC AGGCCCTAAG CCTGGAGCTC CCTTCCCTAA TGGACACGTG 1440

GGTGCTGGAG GCAGTGGCCT GCTCCCACCT CCACCCGCGC TCTGCGGGGC CTCTGCCTGT 1500

GAtGTCTCCG TACGTGTGGT GGTGGGTGAG CCCACCGAGG CCAGGGTGGT TCCGGGCCGG 1560

GGCATCTGCC TGGACCTCGC CATCCTGGAT AGTGCCTTCC TGCTGTCCCA GGTGGCCCCA 1620

TCCCTGTTTA TGGGCTCCAT TGTCCAGCTC AGCCAGTCTG TCACTGCCTA TATGGTGTCT 1680

GCCGCAGGCC TGGGTCTGGT CGCCATTTAC TTTGCTACAC AGGTAGTATT TGACAAGAGC 1740

GACTTGGCCA AATACTCAGC Gggtggacac catcaccatc accattaa 1788

FIG. 3. Structure of CPC-p501 His fusion protein expressed in S. cerevisiae

[Schematic of the fusion protein; legend: Clyta repeats, P2 peptide, P501 sequences]

FIG. 4. Primary structure of CPC-P501 His fusion protein (SEQ ID NO:41)



MAAAYVHSDG SYPKDKFEKI NGTWYYFDSS GYMLADRWRK HTDGNWYWFD NSGEMATGWK 60
KIADKWYYFN EEGAMKTGWV KYKDTWYYLD AKEGAMQYIK ANSKFIGITE GVMVSNAFIQ 120
SADGTGWYYL KPDGTLADRP EKFMYMVLGI GPVLGLVCVP LLGSASDHWR GRYGRRRPFI 180
WALSLGILLS LFLIPRAGWL AGLLCPDPRP LELALLILGV GLLDFCGQVC FTPLEALLSD 240
LFRDPDHCRQ AYSVYAFMIS LGGCLGYLLP AIDWDTSALA PYLGTQEECL FGLLTLIFLT 300
CVAATLLVAE EAALGPTEPA EGLSAPSLSP HCCPCRARLA FRNLGALLPR LHQLCCRMPR 360
TLRRLFVAEL CSWMALMTFT LFYTDFVGEG LYQGVPRAEP GTEARRHYDE GVRMGSLGLF 420
LQCAISLVFS LVMDRLVQRF GTRAVYLASV AAFPVAAGAT CLSHSVAVVT ASAALTGFTF 480
SALQILPYTL ASLYHREKQV FLPKYRGDTG GASSEDSLMT SFLPGPKPGA PFPNGHVGAG 540
GSGLLPPPPA LCGASACDVS VRVVVGEPTE ARVVPGRGIC LDLAILDSAF LLSQVAPSLF 600
MGSIVQLSQS VTAYMVSAAG LGLVAIYFAT QVVFDKSDLA KYSAGGHHHH HH 652

FIG. 5. Nucleotide sequence of CPC P501 His (pRIT15201) (SEQ ID NO:42)

ATGGCGGCCG CTTACGTACA TTCCGACGGC TCTTATCCAA AAGACAAGTT TGAGAAAATC 60 

AATGGCACTT GGTACTACTT TGACAGTTCA GGCTATATGC TTGCAGACCG CTGGAGGAAG 120 

CACACAGACG GCAACTGGTA CTGGTTCGAC AACTCAGGCG AAATGGCTAC AGGCTGGAAG 180 

AAAATCGCTG ATAAGTGGTA CTATTTCAAC GAAGAAGGTG CCATGAAGAC AGGCTGGGTC 240 

AAGTACAAGG ACACTTGGTA CTACTTAGAC GCTAAAGAAG GCGCCATGCA ATACATCAAG 300 

GCTAACTCTA AGTTCATTGG TATCACTGAA GGCGTCATGG TATCAAATGC CTTTATCCAG 360 

TCAGCGGACG GAACAGGCTG GTACTACCTC AAACCAGACG GAACACTGGC AGACAGGCCA 420 

GAAAAGTTCA TGTACATGGT GCTGGGCATT GGTCCAGTGC TGGGCCTGGT CTGTGTCCCG 480

CTCCTAGGCT CAGCCAGTGA CCACTGGCGT GGACGCTATG GCCGCCGCCG GCCCTTCATC 540 

TGGGCACTGT CCTTGGGCAT CCTGCTGAGC CTCTTTCTCA TCCCAAGGGC CGGCTGGCTA 600 

GCAGGGCTGC TGTGCCCGGA TCCCAGGCCC CTGGAGCTGG CACTGCTCAT CCTGGGCGTG 660 

GGGCTGCTGG ACTTCTGTGG CCAGGTGTGC TTCACTCCAC TGGAGGCCCT GCTCTCTGAC 720 

CTCTTCCGGG ACCCGGACCA CTGTCGCCAG GCCTACTCTG TCTATGCCTT CATGATCAGT 780 

CTTGGGGGCT GCCTGGGCTA CCTCCTGCCT GCCATTGACT GGGACACCAG TGCCCTGGCC 840 

CCCTACCTGG GCACCCAGGA GGAGTGCCTC TTTGGCCTGC TCACCCTCAT CTTCCTCACC 900 

TGCGTAGCAG CCACACTGCT GGTGGCTGAG GAGGCAGCGC TGGGCCCCAC CGAGCCAGCA 960 

GAAGGGCTGT CGGCCCCCTC CTTGTCGCCC CACTGCTGTC CATGCCGGGC CCGCTTGGCT 1020 

TTCCGGAACC TGGGCGCCCT GCTTCCCCGG CTGCACCAGC TGTGCTGCCG CATGCCCCGC 1080 

ACCCTGCGCC GGCTCTTCGT GGCTGAGCTG TGCAGCTGGA TGGCACTCAT GACCTTCACG 1140 

CTGTTTTACA CGGATTTCGT GGGCGAGGGG CTGTACCAGG GCGTGCCCAG AGCTGAGCCG 1200 

GGCACCGAGG CCCGGAGACA CTATGATGAA GGCGTTCGGA TGGGCAGCCT GGGGCTGTTC 1260 

CTGCAGTGCG CCATCTCCCT GGTCTTCTCT CTGGTCATGG ACCGGCTGGT GCAGCGATTC 1320 

GGCACTCGAG CAGTCTATTT GGCCAGTGTG GCAGCTTTCC CTGTGGCTGC CGGTGCCACA 1380 

TGCCTGTCCC ACAGTGTGGC CGTGGTGACA GCTTCAGCCG CCCTCACCGG GTTCACCTTC 1440 

TCAGCCCTGC AGATCCTGCC CTACACACTG GCCTCCCTCT ACCACCGGGA GAAGCAGGTG 1500 

TTCCTGCCCA AATACCGAGG GGACACTGGA GGTGCTAGCA GTGAGGACAG CCTGATGACC 1560

AGCTTCCTGC CAGGCCCTAA GCCTGGAGCT CCCTTCCCTA ATGGACACGT GGGTGCTGGA 1620 

GGCAGTGGCC TGCTCCCACC TCCACCCGCG CTCTGCGGGG CCTCTGCCTG TGATGTCTCC 1680 

GTACGTGTGG TGGTGGGTGA GCCCACCGAG GCCAGGGTGG TTCCGGGCCG GGGCATCTGC 1740 

CTGGACCTCG CCATCCTGGA TAGTGCCTTC CTGCTGTCCC AGGTGGCCCC ATCCCTGTTT 1800 

ATGGGCTCCA TTGTCCAGCT CAGCCAGTCT GTCACTGCCT ATATGGTGTC TGCCGCAGGC 1860 

CTGGGTCTGG TCGCCATTTA CTTTGCTACA CAGGTAGTAT TTGACAAGAG CGACTTGGCC 1920 
AAATACTCAG CGGGTGGACA CCATCACCAT CACCATTAA 1959 
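
The position counters of FIG. 4 and FIG. 5 can be cross-checked with simple arithmetic: a coding sequence that ends in one stop codon encodes (length - 3) / 3 residues. The sketch below is illustrative only; the function name `residues_encoded` is a hypothetical helper, not something defined in the specification.

```python
def residues_encoded(nt_length: int, has_stop: bool = True) -> int:
    """Number of amino acids encoded by an in-frame coding sequence."""
    coding = nt_length - 3 if has_stop else nt_length
    if coding % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return coding // 3

# FIG. 5 ends at nucleotide 1959 (including the TAA stop);
# FIG. 4 ends at residue 652.
fig5_residues = residues_encoded(1959)
print(fig5_residues)
```

The same check can be applied to the other nucleotide listings, each of which also ends in a single stop codon.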
FIG. 6. Cloning strategy for generation of plasmid pRIT15201
FIG. 7. Plasmid map of pRIT15201

[Plasmid map; labelled elements: pCUP1, HIS, LEU2]






EP1 511 768 B1 



FIG. 8. Comparative expression of CPC P501 and P501 in S. cerevisiae strain DC5 (gel Laemmli 10%)

[Two gel panels, lanes 1-7 each: Silver staining; Western blot anti P501 (monoclonal antibody)]

1 MW Biolabs (175/83/62/47.5/32.5/16.5 kDa)

2 Y1796 purified

3 Y1795 Crude Extract (negative control)

4 SC333 Crude Extract

5 Y1796 Crude Extract

6 Y1790 Crude Extract

7 Y1802 Crude Extract
FIG. 9A.

[Gel image, lanes 1-13]

1 - Molecular Weight Marker (Biolabs - Broad Range) 175; 83; 62; 47.5; 32.5; 25; 16.5; 6.5 kD

2 - Purified Reference CP2CP501S/12 135 ng

3 - Purified Reference CP2CP501S/12 67.8 ng

4 - Purified Reference CP2CP501S/12 33.9 ng

5 - Purified Reference CP2CP501S/12 16.9 ng

6 - Fermentation PRO119-21h30

7 - Fermentation PRO124-21h30

8 - Fermentation PRO124-22h30

9 - Fermentation PRO127-0h

10 - Fermentation PRO127-4h

11 - Fermentation PRO127-6h

12 - Fermentation PRO127-22h20

13 - Fermentation PRO127-22h45
FIG. 10. Purification scheme of CPC-P501-His produced by Y1796.

S. cerevisiae cells

Dyno-mill disruption: OD 120 / 2 passes / 20 mM Tris pH 8.5 - 5 mM EDTA

Centrifugation: 12.000 g / RT / 90 min (supernatant discarded)

Pellet washing step 1: 20 mM Tris pH 8.5 - 0.15 M NaCl - 2.0 M Guanidine.HCl - 0.1% Empigen (30 min / RT)

Centrifugation: 12.000 g / RT / 60 min (supernatant discarded)

Pellet washing step 2: 20 mM Tris pH 8.5 - 0.15 M NaCl - 4.0 M Urea

Centrifugation: 12.000 g / RT / 30 min (supernatant discarded)

Solubilisation / Reduction: 20 mM Tris pH 8.5 - 0.15 M NaCl - 8.0 M Urea - 1% SDS - 0.2 M Glutathion (60 min / RT)

Centrifugation: 12.000 g / RT / 30 min (pellet discarded)

Carbamidomethylation: 0.3 M Iodoacetamide (30 min / RT / in the dark) / pH adjusted to 8.5 (with 5 M NaOH solution) before incubation

R/C Supernatant

10-fold dilution and pH adjustment (8.5): Dilution buffer 20 mM Tris pH 8.5 - 1 M NaCl - 8.0 M Urea

Immobilised metal ion affinity chromatography on Ni++-Chelating Sepharose FF (Amersham) (10 x 25 cm column - 2000 ml)
Equilibration buffer: 20 mM Tris pH 8.5 - 0.9 M NaCl - 8.0 M Urea - 0.1% SDS
Washing buffers:
1) Equilibration buffer
2) 20 mM Tris pH 8.5 - 0.15 M NaCl - 8.0 M Urea - 0.1% SDS
3) 20 mM Tris pH 8.5 - 8.0 M Urea - 0.1% Tween 80
Elution buffer: 20 mM Tris pH 8.5 - 8.0 M Urea - 0.1% Tween 80 - 0.5 M Imidazole

2-fold dilution and pH adjustment (10.0): 20 mM Piperazine pH 10.6 - 8.0 M Urea - 0.1% Tween 80

Anion exchange chromatography on Q Sepharose FF (Amersham) (2.6 x 6.5 cm column - 35 ml)
Equilibration buffer: 20 mM Piperazine pH 10.0 - 8.0 M Urea - 0.1% Tween 80
Washing buffers:
1) Equilibration buffer
2) 20 mM Tris pH 8.5 - 8.0 M Urea - 0.1% Tween 80
Elution buffer: 20 mM Tris pH 7.5 - 8.0 M Urea - 0.1% Tween 80 - 0.5 M NaCl

Concentration/Diafiltration (Pall - Omega 10 kDa - 200 cm2)
+/- 3-fold concentration
Diafiltration buffer: Tris 20 mM pH 7.5

Sterile filtration (Millipore - Millex GV 0.22 µm)

Purified bulk
Final buffer: 20 mM Tris pH 7.5 - +/- 0.3% Tween 80
Storage -20°C
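
The 10-fold dilution step before the Ni-chelate column can be checked with a simple mixing calculation. The sketch below is an illustration under stated assumptions, not part of the original scheme: it assumes one volume of solubilised supernatant mixed with nine volumes of dilution buffer, and it ignores the small volume changes from the iodoacetamide addition and the pH adjustment. The function name `dilute` and the dictionary keys are hypothetical.

```python
def dilute(sample: dict, diluent: dict, fold: float) -> dict:
    """Concentrations after mixing 1 volume of sample with (fold - 1) volumes of diluent."""
    keys = set(sample) | set(diluent)
    return {
        k: (sample.get(k, 0.0) + (fold - 1) * diluent.get(k, 0.0)) / fold
        for k in keys
    }

# Solubilisation buffer components as given in the scheme (NaCl and urea in M, SDS in %).
supernatant = {"NaCl_M": 0.15, "urea_M": 8.0, "SDS_pct": 1.0}
# Dilution buffer as given in the scheme; it contains no SDS.
dilution_buffer = {"NaCl_M": 1.0, "urea_M": 8.0}

column_load = dilute(supernatant, dilution_buffer, 10)
for name in sorted(column_load):
    print(name, round(column_load[name], 3))
```

Under these assumptions the load comes out at roughly 0.9 M NaCl, 8 M urea and 0.1% SDS, which agrees with the equilibration buffer listed for the IMAC column.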
FIG. 11. Pattern of CPC P501 His purified protein (4-12% Novex Nu-Page polyacrylamide precast gels)

[Three gel panels: Coomassie Blue R250; Daiichi Silver Staining; Western Blot anti P501S (monoclonal antibody)]

1: MW (250/150/75/50/37/25/15/10 kDa)
2: Purified bulk A (reducing conditions)
3: Purified bulk B (reducing conditions)
4: Purified bulk C (reducing conditions)
5: Purified bulk A (non reducing conditions)
6: Purified bulk B (non reducing conditions)
7: Purified bulk C (non reducing conditions)
FIG. 12. Native full-length P501S sequence (SEQ ID NO:17 & 43)

Nucleotide sequence: SEQ ID NO.17
Polypeptide sequence: SEQ ID NO.43
###### 

GCCACCATGGTCCAGAGGCTGTGGGTGAGCCGCCTGCTGCGGCACCGG 
MVQRLWVSRLLRHR 



PDPRPLELALLILGVGLLDF



CGQVCFTPLEALLSDLFRDP 154

GACCACTGTCGCCAGGCCTACTCTGTCTATGCCTTCATGATCAGTCTTGGGGGCTGCCTG
DHCRQAYSVYAFMISLGGCL 174



GYLLPAIDWDTSALAPYLGT



GCCCTGCTTCCCCGGCTGCACCAGCTGTGCTGCCGCATGCCCCGCACCCTGCGCCGGCTC 
ALLPRLHQLCCRMPRTLRRL 274



TTCGTGGGCGAGGGGCTGTACCAGGGCGTGCCCAGAGCTGAGCCGGGCACCGAGGCCCGG 
FVGEGLYQGVPRAEPGTEAR 314 

AGACACTATGATGAAGGCGTTCGGATGGGCAGCCTGGGGCTGTTCCTGCAGTGCGCCATC 
RHYDEGVRMGSLGLFLQCAI 334 



SLVFSLVMDRLVQRFGTRAV
TATTTGGCCAGTGTGGCAGCTTTCCCTGTGGCTGCCGGTGCCACATGCCTGTCCCACAGT
YLASVAAFPVAAGATCLSHS 374

GTGGCCGTGGTGACAGCTTCAGCCGCCCTCACCGGGTTCACCTTCTCAGCCCTGCAGATC
VAVVTASAALTGFTFSALQI 394

CTGCCCTACACACTGGCCTCCCTCTACCACCGGGAGAAGCAGGTGTTCCTGCCCAAATAC
LPYTLASLYHREKQVFLPKY 414

CGAGGGGACACTGGAGGTGCTAGCAGTGAGGACAGCCTGATGACCAGCTTCCTGCCAGGC 
RGDTGGASSEDSLMTSFLPG 434 



PKPGAPFPNGHVGAGGSGLL 454 



PPPPALCGASACDVSVRVVV 474

GGTGAGCCCACCGAGGCCAGGGTGGTTCCGGGCCGGGGCATCTGCCTGGACCTCGCCATC
GEPTEARVVPGRGICLDLAI 494

CTGGATAGTGCCTTCCTGCTGTCCCAGGTGGCCCCATCCCTGTTTATGGGCTCCATTGTC
LDSAFLLSQVAPSLFMGSIV 514

CAGCTCAGCCAGTCTGTCACTGCCTATATGGTGTCTGCCGCAGGCCTGGGTCTGGTCGCC
QLSQSVTAYMVSAAGLGLVA 534

ATTTACTTTGCTACACAGGTAGTATTTGACAAGAGCGACTTGGCCAAATACTCAGCGTAG
IYFATQVVFDKSDLAKYSA* 554
FIG. 13. Sequence of the CPC-P501S expression cassette of JNW735 (SEQ ID NO:18 & 44)

Nucleotide sequence: SEQ ID NO.18
Polypeptide sequence: SEQ ID NO.44



###### 

GCCACCATGGCGGCCGCTTACGTACATTCCGACGGCTCTTATCCAAAA 

MAAAYVHSDGSYPK 14 

GACAAGTTTGAGAAAATCAATGGCACTTGGTACTACTTTGACAGTTCAGGCTATATGCTT 
DKFEKINGTWYYFDSSGYML 34 

GCAGACCGCTGGAGGAAGCACACAGACGGCAACTGGTACTGGTTCGACAACTCAGGCGAA 
ADRWRKHTDGNWYWFDNSGE 54

ATGGCTACAGGCTGGAAGAAAATCGCTGATAAGTGGTACTATTTCAACGAAGAAGGTGCC
MATGWKKIADKWYYFNEEGA 74

ATGAAGACAGGCTGGGTCAAGTACAAGGACACTTGGTACTACTTAGACGCTAAAGAAGGC
MKTGWVKYKDTWYYLDAKEG 94

GCCATGCAATACATCAAGGCTAACTCTAAGTTCATTGGTATCACTGAAGGCGTCATGGTA
AM|QYIKANSKFIGITE|GVMV 114

TCAAATGCCTTTATCCAGTCAGCGGACGGAACAGGCTGGTACTACCTCAAACCAGACGGA
SNAFIQSADGTGWYYLKPDG 134

ACACTGGCAGACAGGCCAGAAAAGTTCATGTACATGGTGCTGGGCATTGGTCCAGTGCTG
TLADRPEKFMYMVLGIGPVL 154

GGCCTGGTCTGTGTCCCGCTCCTAGGCTCAGCCAGTGACCACTGGCGTGGACGCTATGGC
GLVCVPLLGSASDHWRGRYG 174

CGCCGCCGGCCCTTCATCTGGGCACTGTCCTTGGGCATCCTGCTGAGCCTCTTTCTCATC 
RRRPFIWALSLGILLSLFLI 194 

CCAAGGGCCGGCTGGCTAGCAGGGCTGCTGTGCCCGGATCCCAGGCCCCTGGAGCTGGCA 
PRAGWLAGLLCPDPRPLELA 214 



CTGCTCATCCTGGGCGTGGGGCTGCTGGACTTCTGTGGCCAGGTGTGCTTCACTCCACTG 
LLILGVGLLDFCGQVCFTPL 234

GAGGCCCTGCTCTCTGACCTCTTCCGGGACCCGGACCACTGTCGCCAGGCCTACTCTGTC 
EALLSDLFRDPDHCRQAYSV 254 



TATGCCTTCATGATCAGTCTTGGGGGCTGCCTGGGCTACCTCCTGCCTGCCATTGACTGG 
YAFMISLGGCLGYLLPAIDW 274 



GACACCAGTGCCCTGGCCCCCTACCTGGGCACCCAGGAGGAGTGCCTCTTTGGCCTGCTC 
DTSALAPYLGTQEECLFGLL 294

ACCCTCATCTTCCTCACCTGCGTAGCAGCCACACTGCTGGTGGCTGAGGAGGCAGCGCTG 
TLIFLTCVAATLLVAEEAAL 314
TGCCGGGCCCGCTTGGCTTTCCGGAACCTGGGCGCCCTGCTTCCCCGGCTGCACCAGCTG
CRARLAFRNLGALLPRLHQL 354



CCRMPRTLRRLFVAELCSWM 374

GCACTCATGACCTTCACGCTGTTTTACACGGATTTCGTGGGCGAGGGGCTGTACCAGGGC 
ALMTFTLFYTDFVGEGLYQG 394 

GTGCCCAGAGCTGAGCCGGGCACCGAGGCCCGGAGACACTATGATGAAGGCGTTCGGATG 
VPRAEPGTEARRHYDEGVRM 414 

GGCAGCCTGGGGCTGTTCCTGCAGTGCGCCATCTCCCTGGTCTTCTCTCTGGTCATGGAC 
GSLGLFLQCAISLVFSLVMD 434

CGGCTGGTGCAGCGATTCGGCACTCGAGCAGTCTATTTGGCCAGTGTGGCAGCTTTCCCT
RLVQRFGTRAVYLASVAAFP 454

GTGGCTGCCGGTGCCACATGCCTGTCCCACAGTGTGGCCGTGGTGACAGCTTCAGCCGCC 
VAAGATCLSHSVAVVTASAA 474 



LTGFTFSALQILPYTLASLY 494 



HREKQVFLPKYRGDTGGASS 514 

GAGGACAGCCTGATGACCAGCTTCCTGCCAGGCCCTAAGCCTGGAGCTCCCTTCCCTAAT 
EDSLMTSFLPGPKPGAPFPN 534 



GHVGAGGSGLLPPPPALCGA 554

TCTGCCTGTGATGTCTCCGTACGTGTGGTGGTGGGTGAGCCCACCGAGGCCAGGGTGGTT 
SACDVSVRVVVGEPTEARVV 574 

CCGGGCCGGGGCATCTGCCTGGACCTCGCCATCCTGGATAGTGCCTTCCTGCTGTCCCAG 
PGRGICLDLAILDSAFLLSQ 594 

GTGGCCCCATCCCTGTTTATGGGCTCCATTGTCCAGCTCAGCCAGTCTGTCACTGCCTAT 
VAPSLFMGSIVQLSQSVTAY 614

ATGGTGTCTGCCGCAGGCCTGGGTCTGGTCGCCATTTACTTTGCTACACAGGTAGTATTT 
MVSAAGLGLVAIYFATQVVF 634 

GACAAGAGCGACTTGGCCAAATACTCAGCGTAGGTCGAG 

DKSDLAKYSA* 645 
FIG. 14 - Codon optimised P501S sequences (SEQ ID NO:19-20)
SEQ ID NO:19

ATGGTGCAGCGGCTCTGGGTGAGCCGCCTCCTGCGGCATCGCAAGGCCCAGCTCCTGCTGGTGAATCTGCTCA
CATTCGGCCTGGAGGTGTGCCTGGCCGCCGGCATCACCTACGTGCCCCCCCTCCTGCTGGAGGTGGGAGTCGA 
GGAGAAGTTCATGACCATGGTGCTGGGCATTGGGCCCGTCCTGGGCCTCGTGTGCGTGCCTCTCCTCGGCAGC 
GCTTCCGACCyVTTGGCGCGGCCGGTATGGCCGCAGGAGACCCTTCATCTGGGCTCTGAGTCTCGGCATCCTGC 



GGCCCTGCTGATCCTCGGCGTGGGCCTGCTGGACTTCTGCGGCCAGGTGTGCTTCACGCCCCTGGAGGCACTG 
CTGAGCGACCTGTTCCGGGACCCCGACCATTGCCGCCAGGCGTACAGCGTGTACGCCTTCATGATCTCCCTGG 
GAGGCTGCCTGGGCTACCTGCTCCCCGCCATCGATTGGGACACCAGCGCACTCGCCCCCTATCTCGGAACACA

GAGGCCGCCCTGGGGCCCACCGAGCCGGCCGAGGGACTGAGCGCCCCGAGCCTGAGTCCACACTGCTGCCCTT 

5ATGGCTCTCATGACCTTCACCCTGTTTTAT 



CATGGACAGGCTGGTGCAGCGCTTCGGAACCCGGGCGGTGTACCTGGCGAGCGTGGCCGCCTTCCCCGTGGCT 
GCCGGCGCCACCTGCCTCTCTCACTCGGTGGCCGTGGTCACCGCCAGCGCCGCCCTGACCGGGTTCACCTTCT 



GCCCCTTTCCCCAACGGGCACGTGGGCGCCGGCGGGAGTGGGCTCCTGCCCCCCCCTCCTGCGCTGTGCGGGG 
CCAGCGCCTGCGACGTGAGCGTGCGCGTGGTGGTGGGCGAGCCCACCGAGGCCCGCGTGGTGCCGGGCAGAGG 



TCTATCGTCCAGCTGTCTCAGAGCGTCACCGCTTACATGGTGTCCGCTGCTGGACTGGGCTTGGTGGCTATTT 
ATTTCGCCACCCAGGTGGTGTTCGACAAGAGCGACCTGGCCAAATACTCCGCCTGA 



atggtgcagcggctgtgggtgtcccggctgctgcgccatagaaaggcccagttgctgctggtgaacctgctga 
ctttcggactggaggtgtgcctggctgccgggatcacgtacgtgccccccctgctgctggaggtgggcgtgga 

gcgtccgatcattggcggggccgctacggccgccgcagaccgttcatctgggccctgagcctggggatcctgc 
tctctctcttcctgatcccccgggccggctggctggccggcctgctgtgtcccgacccccgccctctggagct
ggccctcctgatcctgggcgtgggcttgttggacttctgcggccaggtgtgtttcactcccctggaggctctg
ctctccgacctcttccgcgaccccgaccactgtaggcaggcttacagcgtgtacgccttcatgatcagtctgg
GGGGATGCCTGGGCTATCTGCTGCCCGCTATCGACTGGGACACCAGCGCCCTGGCCCCCTACCTGGGGACTCA
GGAGGAGTGCCTGTTCGGCCTGCTCACCTTGATCTTCCTGACGTGCGTCGCCGCCACCCTGCTGGTGGCCGAG
GAGGCGGCCCTGGGGCCCACCGAGCCCGCCGAGGGCCTGAGCGCTCCCAGCCTGAGCCCCCATTGCTGCCCGT
GCAGGGCTAGGCTCGCCTTCAGGAATCTGGGCGCTTTGCTGCCCCGCCTGCATCAGCTGTGCTGTCGCATGCC

ACCGACTTCGTGGGGGAGGGCCTGTACCAGGGCGTGCCCAGGGCCGAGCCCGGCACCGAGGCTAGGCGCCATT 

GATGGACCGGCTGGTGCAGCGCTTCGGCACCCGGGCCGTGTACCTCGCCTCTGTGGCGGCTTTCCCCGTCGCC 
GCCGGCGCGACCTGCCTGTCTCATTCTGTCGCCGTGGTGACCGCCAGCGCCGCCCTGACCGGCTTCACCTTCA 
GTGCGCTCCAGATTCTGCCCTACACCCTGGCGTCTCTGTACCATCGCGAGAAGCAGGTGTTCCTGCCCAAGTA 
CCGCGGGGACACAGGGGGAGCTTCCTCTGAGGACAGCCTGATGACCAGCTTCTTGCCCGGCCCCAAGCCGGGG 
GCCCCTTTC( 

GATCTGCCTGGACCTGGCCATCCTCGACTCCGCCTTCCTGCTCTCCCAGGTGGCGCCCAGCCTGTTCATGGGC 
AGTATCGTGCAGCTGAGCCAGAGCGTGACCGCCTACATGGTGAGCGCCGCCGGCCTGGGGTTGGTGGCCATCT 
ACTTTGCCACCCAGGTCGTGTTCGACAAGAGCGATCTCGCCAAGTATAGCGCCTGA 
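
Codon optimisation changes codons without changing the encoded protein. As a hedged illustration (not part of the original disclosure), the sketch below compares the first six codons of the native sequence of FIG. 12 with the first six codons of SEQ ID NO:19 as read here, with the OCR character `Q` in that line taken to be `G`. The small `CODON_TO_AA` map covers only the codons used in this example and is an assumption made for brevity, as are the helper names.

```python
# Partial standard-code lookup, limited to the codons that occur below.
CODON_TO_AA = {
    "ATG": "M", "GTC": "V", "GTG": "V", "CAG": "Q",
    "AGG": "R", "CGG": "R", "CTG": "L", "CTC": "L", "TGG": "W",
}

def codons(dna: str) -> list:
    """Split an in-frame DNA string into codons."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]

native = "ATGGTCCAGAGGCTGTGG"      # FIG. 12, codons 1-6 (M V Q R L W)
optimised = "ATGGTGCAGCGGCTCTGG"   # SEQ ID NO:19, codons 1-6 as read here

pairs = list(zip(codons(native), codons(optimised)))
changed = [(i + 1, a, b) for i, (a, b) in enumerate(pairs) if a != b]
same_protein = all(CODON_TO_AA[a] == CODON_TO_AA[b] for a, b in pairs)

print(changed)
print(same_protein)
```

In this short window three of the six codons differ, and every change is synonymous.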
FIG. 15 - Re-engineered codon optimised sequence 19 (SEQ ID NO:21)




GCCCCTGGAGGCACTGCTGAGCGACCTGTTCCGGGACCCCGACCATTGCCGCCAGGCGTACAGCGTGTACGCC
TTCATGATCTCCCTGGGAGGCTGCCTGGGCTACCTGCTCCCCGCCATCGATTGGGACACCAGCGCACTCGCCC 
CCTATCTCGGAACACAGGAGGAATGCCTGTTCGGA@T<^§rGACGCTCATCTTCCTCACGTGCGTCGCGGCCAC 
CCTGTTGGTGGCCGAGGAGGCCGCCCTGGGGCCCACCGAGCCGGCCGAGGGACTGAGCGCCCCGAGCCTGAGT 
CCACACTGCTGCCCTTGCCGGGCCCGCCTGGCCTTCeXSTAATCTGGGCGCCCTCCTGCCTCGGCTCCATCAGC 
TCTGTTGCAGAATGCCTAGGACGCTGCGGCGCCTGTTCGTCGCTGAGTTGTGCTCCTGGATGGCTCTCATGAC 
CTTCACCCTGTTTTATACGGACTTCGTCGGGGAGGGCCTGTACCAGGGGGTGCCGCGCGCCGAGCCCGGGACA
GAGGCGCGCCGCCACTACGACGAGGGAGTGCGTATGGGCTCCCTGGGCCTCTTCTTGCAGTGCGCCATCAGTC 
TGGTTTTCTCTCTGGTCATGGACAGGCTGGTGCAGCGCTTCGGAACCCGGGCGGTGTACCTGGCGAGCGTGGC 
CGCCTTCCCCGTGGCTGCCGGCGCCACCTGCCTCTCTCACTCGGTGGCCGTGGTCACCGCCAGCGCCGCCCTG 
ACCGGGTTCACCTTCTCTGCCCTGCAGATTCTGCCTTACACCCTGGCCAGCCTGTACCATCGCGAGAAACAGG 
TGTTTCTCCCCAAGTACAGAGGCGACACCGGGGGCGCCTCCAGCGAGGACAGCCTCATGACCTCCTTCCTGCC 
TGGCCCCAAGCCCGGCGCCCCTTTCCCCAACGGGCACGTGGGCGCCGGCGGGAGTGGGCTCCTGCCCCCCCCT 
CCTGCGCTGTGCGGGGCCAGCGCCTGCGACGTGAGCGTGCGCGTGGTGGTGGGCGAGCCCACCGAGGCCCGCG 

GTCCCTCTTCATGGGCTCTATCGTCCAGCTGTCTCAGAGCGTCACCGCTTACATGGTGTCCGCTGCTGGACTG 

TCGAGGCAG 
FIG. 16 - Re-engineered codon optimised sequence 20 (SEQ ID NO:22)



GACG GCTAGC GCCACCATGGTGCAGCGGCTGTGGGTGTCCCGGCTGCTGCGCCATAGAAAGGCCCAGTTGCTG
CTGGTGAACCTGCTGACTTTCGGACTGGAGGTGTGCCTGGCTGCCGGGATCACGTACGTGCCCCCCCTGCTGC 

GCCCCTCCTCGGGAGTGCGTCCGATCATTGGCGGGGCCGCTACGGCCGCCGCAGACCGTTCATCTGGGCCCTG 
AGCCTGGGCATCCTGCTCTCTCTCTTCCTGATCCCCCGGGCCGGCTGGCTGGCCGGCCTGCTGTGTCCCGACC 

TCCCCTCGAGGCTCTGCTCTCCGACCTCTTCCGCGACCCCGACCACTGTAGGCAGGCTTACAGCGTGTACGCC
TTCATGATCAGTCTGGGGGGATGCCTGGGCTATCTGCTGCCCGCTATCGACTGGGACACCAGCGCCCTGGCCC 
CCTACCTGGGGACTCAGGAGGAGTGCCTGTTCGGCCTGCTCACCTTGATCTTCCTGACGTGCGTCGCCGCCAC 
CCTGCTGGTGGCCGAGGAGGCGGCCCTGGGGCCCACCGAGCCCGCCGAGGGCCTGAGCGCTCCCAGCCTGAGC 
CCCCATTGCTGCCCGTGCAGGGCTAGGCTCGCCTTCAGGAATCTGGGCGCTTTGCTGCCCCGCCTGCATCAGC 



GTTCACCCTCTTCTACACCGACTTCGTGGGGGAGGGCCTGTACCAGGGCGTGCCCAGGGCCGAGCCCGGCACC 
GAGGCTAGGCGCCATTACGACGAGGGCGTCAGGATGGGCTCTCTGGGCCTCTTCCTGCAGTGCGCCATCAGTC 
TGGTGTTCTCTCTGGTGATGGACCGGCTGGTGCAGCGCTTCGGCACCCGGGCCGTGTACCTCGCCTCTGTGGC 
GGCTTTCCCCGTCGCCGCCGGCGCGACCTGCCTGTCTCATTCTGTCGCCGTGGTGACCGCCAGCGCCGCCCTG
ACCGGCTTCACCTTCAGTGCGCTCCAGATTCTGCCCTACACCCTGGCGTCTCTGTACCATCGCGAGAAGCAGG 
TGTTCCTGCCCAAGTACCGCGGGGACACAGGGGGAGCTTCCTCTGAGGACAGCCTGATGACCAGCTTCTTGCC 

CCCGCCCTGTGCGGCGCTAGTGCCTGCGACGTGAGCGTGCGGGTGGTGGTGGGGGAGCCCACCGAGGCTAGGG

CAGCCTGTTCATGGGCAGTATCGTGCAGCTGAGCCAGAGCGTGACCGCCTACATGGTGAGCGCCGCCGGCCTG 
GGGTTGGTGGCCATCTACTTTGCCACCCAGGTCGTGTTCGACAAGAGCGATCTCGCCAAGTATAGCGCCTGAC 
TCGAGGCAG 
FIG. 17 - The starting sequence for the optimisation of CPC (SEQ ID NO:23)
Four amino acids of P501S sequence are boxed.

ATGGCGGCCGCTTACGTACATTCCGACGGCTCTTATCCAAAAGACAAGTTTGAGAAAATCAATGGCACTTGGT
ACTACTTTGACAGTTCAGGCTATATGCTTGCAGACCGCTGGAGGAAGCACACAGACGGCAACTGGTACTGGTT
CGACAACTCAGGCGAAATGGCTACAGGCTGGAAGAAAATCGCTGATAAGTGGTACTATTTCAACGAAGAAGGT
GCCATGAAGACAGGCTGGGTCAAGTACAAGGACACTTGGTACTACTTAGACGCTAAAGAAGGCGCCATGCAAT
ACATCAAGGCTAACTCTAAGTTCATTGGTATCACTGAAGGCGTCATGGTATCAAATGCCTTTATCCAGTCAGC
GGACGGAACAGGCTGGTACTACCTCAAACCAGACGGAACACTGGCAGACAGGCCAGAA|AAGTTCATGTAC|

FIG. 18 - Representative codon optimised CPC sequences (SEQ ID NO:24-25) 
SEQ ID NO:24 

ATGGCCGCCGCCTACGTGCATAGCGACGGGAGCTACCCCAAGGACAAGTTCGAGAAGATCAACGGGACATGGT 
ACTACTTCGACTCCTCCGGCTACATGCTCGCCGACCGCTGGCGGAAGCACACCGACGGCAACTGGTACTGGTT 
CGATAACTCGGGAGAGATGGCCACCGGCTGGAAGAAGATCGCGGACAAGTGGTACTATTTCAACGAGGAGGGC 

ATATCAAGGCCAACAGCAAGTTCATCGGCATCACCGAGGGAGTGATGGTCAGCAACGCCTTTATCCAGAGCGC 
CGACGGCACCGGATGGTACTACTTGAAGCCGGACGGCACCCTCGCGGATCGGCCCGAGAAGTTCATGTAC 

SEQ ID NO:25 

ATGGCCGCCGCCTACGTGCACAGCGACGGGTCCTACCCAAAGGACAAGTTCGAGAAGATCAACGGCACGTGGT 
ACTATTTCGACAGCAGCGGCTACATGCTCGCCGATCGCTGGCGCAAGCACACCGACGGGAACTGGTACTGGTT 
CGACAACTCTGGCGAGATGGCTACGGGGTGGAAGAAGATCGCCGACAAGTGGTACTACTTCAACGAGGAGGGC 
GCCATGAAGACCGGGTGGGTGAAGTACAAGGACACCTGGTACTACCTGGACGCTAAGGAGGGCGCCATGCAGT 
ACATCAAGGCCAACTCGAAGTTCATCGGGATCACCGAGGGCGTGATGGTCAGTAACGCTTTCATCCAGAGCGC 
GGACGGCACAGGCTGGTATTACCTGAAGCCCGATGGCACCCTGGCGGACAGACCTGAGAAATTCATGTAC 
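
The codon optimised sequences of FIG. 18 differ from one another at the nucleotide level while being intended to encode the same CPC polypeptide. The following Python sketch is offered purely as an illustration of how that equivalence could be checked; it is not part of the described method. It assumes the standard genetic code, and the names `translate` and `same_protein` are hypothetical helpers introduced here. Only the first 48 nucleotides of SEQ ID NO:24 and SEQ ID NO:25 are used as input.

```python
# Illustrative sketch only: translate DNA with the standard genetic code and
# compare the peptides encoded by two differently codon-optimised sequences.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    seq = "".join(dna.split()).upper()      # drop whitespace and line breaks
    usable = len(seq) - len(seq) % 3        # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


def same_protein(dna_a: str, dna_b: str) -> bool:
    """True if both sequences encode the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


# First 48 nucleotides of SEQ ID NO:24 and SEQ ID NO:25 as listed in FIG. 18.
SEQ24_PREFIX = "ATGGCCGCCGCCTACGTGCATAGCGACGGGAGCTACCCCAAGGACAAG"
SEQ25_PREFIX = "ATGGCCGCCGCCTACGTGCACAGCGACGGGTCCTACCCAAAGGACAAG"

if __name__ == "__main__":
    print(translate(SEQ24_PREFIX))                   # MAAAYVHSDGSYPKDK
    print(same_protein(SEQ24_PREFIX, SEQ25_PREFIX))  # True
```

The same check could be applied to the full-length sequences once any characters lost in reproduction of the listings have been resolved.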

FIG. 19 - Engineered CPC codon optimised sequence (SEQ ID NO:26) 
SEQ ID NO:26 

GACG GCTAGC GCCACCATGGCCGCCGCCTACGTGCATAGCGACGGGAGCTACCCCAAGGACAAGTTCGAGAAG 
ATCAACGGGACATGGTACTACTTCGACTCCTCCGGCTACATGCTCGCCGACCGCTGGCGGAAGCACACCGACG 
GCAACTGGTACTGGTTCGATAACTCGGGAGAGATGGCCACCGGCTGGAAGAAGATCGCGGACAAGTGGTACTA 
TTTCAACGAGGAGGGCGCCATGAAGACCGGCTGGGTGAAGTATAAGGACACCTGGTACTACCTCGACGCCAAG 
GAGGGCGCCATGCAGTATATCAAGGCCAACAGCAAGTTCATCGGCATCACCGAGGGAGTGATGGTCAGCAACG 
CCTTTATCCAGAGCGCCGACGGCACCGGATGGTACTACTTGAAGCCGGACGGCACCCTCGCGGATCGGCCCGA 
G[AAGTTCATGTAC]TGACTGGAGGCAG
FIG. 20 - P501S CPC fusion candidate constructs and sequences
Construct A = SEQ ID NO:37 (nucleotide) & 45 (polypeptide)

GCGGCCGCGCCACCATGGCCGCCGCCTACGTGCATAGCGACGGGAGCTACCCCAAGGACA 
MAAAYVHSDGSYPKDK 



FEKINGTWYYFDSSGYMLAD



ACCGCTGGCGGAAGCACACCGACGGCAACTGGTACTGGTTCGATAACTCGGGAGAGATGG 
RWRKHTDGNWYWFDNSGEMA 

CCACCGGCTGGAAGAAGATCGCGGACAAGTGGTACTATTTCAACGAGGAGGGCGCCATGA 
TGWKKIADKWYYFNEEGAMK

AGACCGGCTGGGTGAAGTATAAGGACACCTGGTACTACCTCGACGCCAAGGAGGGCGCCA 
TGWVKYKDTWYYLDAKEGAM 

TGCAGTATATCAAGGCCAACAGCAAGTTCATCGGCATCACCGAGGGAGTGATGGTCAGCA 
QYIKANSKFIGITEGVMVSN 

ACGCCTTTATCCAGAGCGCCGACGGCACCGGATGGTACTACTTGAAGCCGGACGGCACCC 
AFIQSADGTGWYYLKPDGTL 

TCGCGGATCGGCCCGAGAAGTTCATGTACATGGTGCTGGGCATCGGCCCCGTCCTGGGCC 
ADRPEKFMYMVLGIGPVLGL 

TCGTGTGTGTGCCCCTCCTCGGGAGTGCGTCCGATCATTGGCGGGGCCGCTACGGCCGCC 
VCVPLLGSASDHWRGRYGRR 

GCAGACCGTTCATCTGGGCCCTGAGCCTGGGCATCCTGCTCTCTCTCTTCCTGATCCCCC 
RPFIWALSLGILLSLFLIPR

AGWLAGLLCPDPRPLELALL

TGATCCTGGGCGTGGGCCTGCTGGACTTCTGCGGCCAGGTGTGTTTCACTCCCCTGGAGG
ILGVGLLDFCGQVCFTPLEA

CTCTGCTCTCCGACCTCTTCCGCGACCCCGACCACTGTAGGCAGGCTTACAGCGTGTACG 
LLSDLFRDPDHCRQAYSVYA 



CCAGCGCCCTGGCCCCCTACCTGGGGACTCAGGAGGAGTGCCTGTTCGGCCTGCTCACCT 
SALAPYLGTQEECLFGLLTL 

TGATCTTCCTGACGTGCGTCGCCGCCACCCTGCTGGTGGCCGAGGAGGCGGCCCTGGGGC 
IFLTCVAATLLVAEEAALGP



GGGCTAGGCTCGCCTTCAGGAATCTGGGCGCTTTGCTGCCCCGCCTGCATCAGCTGTGCT 
ARLAFRNLGALLPRLHQLCC 



TGATGACGTTCACCCTCTTCTACACCGACTTCGTGGGGGAGGGCCTGTACCAGGGCGTGC 
MTFTLFYTDFVGEGLYQGVP

CCAGGGCCGAGCCCGGCACCGAGGCTAGGCGCCATTACGACGAGGGCGTCAGGATGGGCT 
RAEPGTEARRHYDEGVRMGS 



TGGTGCAGCGCTTCGGCACCCGGGCCGTGTACCTCGCCTCTGTGGCGGCTTTCCCCGTCG 
VQRFGTRAVYLASVAAFPVA



AGATCLSHSVAVVTASAALT 

CCGGCTTCACCTTCAGTGCGCTCCAGATTCTGCCCTACACCCTGGCGTCTCTGTACCATC
GFTFSALQILPYTLASLYHR 

GCGAGAAGCAGGTGTTCCTGCCCAAGTACCGCGGGGACACAGGGGGAGCTTCCTCTGAGG 
EKQVFLPKYRGDTGGASSED 

ACAGCCTGATGACCAGCTTCTTGCCCGGCCCCAAGCCGGGGGCCCCTTTCCCCAACGGCC 
SLMTSFLPGPKPGAPFPNGH

ATGTCGGGGCGGGCGGCAGCGGCCTGCTCCCTCCCCCCCCCGCCCTGTGCGGCGCTAGTG 
VGAGGSGLLPPPPALCGASA 

CCTGCGACGTGAGCGTGCGGGTGGTGGTGGGGGAGCCCACCGAGGCTAGGGTCGTGCCTG 
CDVSVRVVVGEPTEARVVPG 

GCCGGGGGATCTGCCTGGACCTGGCCATCCTCGACTCCGCCTTCCTGCTCTCCCAGGTGG 
RGICLDLAILDSAFLLSQVA 

CGCCCAGCCTGTTCATGGGCAGTATCGTGCAGCTGAGCCAGAGCGTGACCGCCTACATGG 
PSLFMGSIVQLSQSVTAYMV

TGAGCGCCGCCGGCCTGGGGTTGGTGGCCATCTACTTTGCCACCCAGGTCGTGTTCGACA
SAAGLGLVAIYFATQVVFDK 

AGAGCGATCTCGCCAAGTATAGCGCCTGAGGATCC 
SDLAKYSA* 



Construct B = SEQ ID NO:38 (nucleotide) & 46 (polypeptide) 

GCGGCCGCGCCACCATGGCCGCCGCCTACGTGCATAGCGACGGGAGCTACCCCAAGGACA 
MAAAYVHSDGSYPKDK 

AGTTCGAGAAGATCAACGGGACATGGTACTACTTCGACTCCTCCGGCTACATGCTCGCCG 
FEKINGTWYYFDSSGYMLAD 

ACCGCTGGCGGAAGCACACCGACGGCAACTGGTACTGGTTCGATAACTCGGGAGAGATGG 
RWRKHTDGNWYWFDNSGEMA 

CCACCGGCTGGAAGAAGATCGCGGACAAGTGGTACTATTTCAACGAGGAGGGCGCCATGA 
TGWKKIADKWYYFNEEGAMK

AGACCGGCTGGGTGAAGTATAAGGACACCTGGTACTACCTCGACGCCAAGGAGGGCGCCA 
TGWVKYKDTWYYLDAKEGAM 

TGCAGTATATCAAGGCCAACAGCAAGTTCATCGGCATCACCGAGGGAGTGATGGTCAGCA 
QYIKANSKFIGITEGVMVSN 

ACGCCTTTATCCAGAGCGCCGACGGCACCGGATGGTACTACTTGAAGCCGGACGGCACCC 
AFIQSADGTGWYYLKPDGTL

TCGCGGATCGGCCCGAGATGGTGCAGCGGCTGTGGGTGTCCCGGCTGCTGCGCCATAGAA 
ADRPEMVQRLWVSRLLRHRK 

AGGCCCAGTTGCTGCTGGTGAACCTGCTGACTTTCGGACTGGAGGTGTGCCTGGCTGCCG 
AQLLLVNLLTFGLEVCLAAG

ITYVPPLLLEVGVEEKFMYM

TGGTGCTGGGCATCGGCCCCGTCCTGGGCCTCGTGTGTGTGCCCCTCCTCGGGAGTGCGT 
VLGIGPVLGLVCVPLLGSAS 

CCGATCATTGGCGGGGCCGCTACGGCCGCCGCAGACCGTTCATCTGGGCCCTGAGCCTGG 
DHWRGRYGRRRPFIWALSLG

GCATCCTGCTCTCTCTCTTCCTGATCCCCCGGGCCGGCTGGCTGGCCGGCCTGCTGTGTC 
ILLSLFLIPRAGWLAGLLCP 

CCGACCCCCGCCCTCTGGAGCTGGCCCTCCTGATCCTGGGCGTGGGCCTGCTGGACTTCT 
DPRPLELALLILGVGLLDFC 

GCGGCCAGGTGTGTTTCACTCCCCTGGAGGCTCTGCTCTCCGACCTCTTCCGCGACCCCG 
GQVCFTPLEALLSDLFRDPD 

ACCACTGTAGGCAGGCTTACAGCGTGTACGCCTTCATGATCAGTCTGGGGGGATGCCTGG 
HCRQAYSVYAFMISLGGCLG 

GCTATCTGCTGCCCGCTATCGACTGGGACACCAGCGCCCTGGCCCCCTACCTGGGGACTC
YLLPAIDWDTSALAPYLGTQ

AGGAGGAGTGCCTGTTCGGCCTGCTCACCTTGATCTTCCTGACGTGCGTCGCCGCCACCC
EECLFGLLTLIFLTCVAATL

TGCTGGTGGCCGAGGAGGCGGCCCTGGGGCCCACCGAGCCCGCCGAGGGCCTGAGCGCTC 
LVAEEAALGPTEPAEGLSAP 

CCAGCCTGAGCCCCCATTGCTGCCCGTGCAGGGCTAGGCTCGCCTTCAGGAATCTGGGCG 
SLSPHCCPCRARLAFRNLGA 

CTTTGCTGCCCCGCCTGCATCAGCTGTGCTGTCGCATGCCTCGCACCCTGCGCCGCCTGT 
LLPRLHQLCCRMPRTLRRLF 



TCGTGGGGGAGGGCCTGTACCAGGGCGTGCCCAGGGCCGAGCCCGGCACCGAGGCTAGGC 
VGEGLYQGVPRAEPGTEARR 



LVFSLVMDRLVQRFGTRAVY



AVVTASAALTGFTFSALQIL



EPTEARVVPGRGICLDLAIL

TCGACTCCGCCTTCCTGCTCTCCCAGGTGGCGCCCAGCCTGTTCATGGGCAGTATCGTGC 
DSAFLLSQVAPSLFMGSIVQ

AGCTGAGCCAGAGCGTGACCGCCTACATGGTGAGCGCCGCCGGCCTGGGGTTGGTGGCCA 
LSQSVTAYMVSAAGLGLVAI 

TCTACTTTGCCACCCAGGTCGTGTTCGACAAGAGCGATCTCGCCAAGTATAGCGCCTGAG 
YFATQVVFDKSDLAKYSA* 

GATCC 

Construct C = SEQ ID NO:39 (nucleotide) & 47 (polypeptide) 

GCGGCCGCGCCACCATGGCCGCCGCCTACGTGCATAGCGACGGGAGCTACCCCAAGGACA
MAAAYVHSDGSYPKDK 

AGTTCGAGAAGATCAACGGGACATGGTACTACTTCGACTCCTCCGGCTACATGCTCGCCG 
FEKINGTWYYFDSSGYMLAD

ACCGCTGGCGGAAGCACACCGACGGCAACTGGTACTGGTTCGATAACTCGGGAGAGATGG 
RWRKHTDGNWYWFDNSGEMA 



AGACCGGCTGGGTGAAGTATAAGGACACCTGGTACTACCTCGACGCCAAGGAGGGCGCCA 
TGWVKYKDTWYYLDAKEGAM



TGCAGTATATCAAGGCCAACAGCAAGTTCATCGGCATCACCGAGGGAGTGATGGTCAGCA 
QYIKANSKFIGITEGVMVSN 



ACGCCTTTATCCAGAGCGCCGACGGCACCGGATGGTACTACTTGAAGCCGGACGGCACCC 
AFIQSADGTGWYYLKPDGTL 



TCGTGTGTGTGCCCCTCCTCGGGAGTGCGTCCGATCATTGGCGGGGCCGCTACGGCCGCC 
VCVPLLGSASDHWRGRYGRR 



RPFIWALSLGILLSLFLIPR



ILGVGLLDFCGQVCFTPLEA



LLSDLFRDPDHCRQAYSVYA

CCTTCATGATCAGTCTGGGGGGATGCCTGGGCTATCTGCTGCCCGCTATCGACTGGGACA 
FMISLGGCLGYLLPAIDWDT 



CCAGCGCCCTGGCCCCCTACCTGGGGACTCAGGAGGAGTGCCTGTTCGGCCTGCTCACCT 



SALAPYLGTQEECLFGLLTL



IFLTCVAATLLVAEEAALGP



CCACCGAGCCCGCCGAGGGCCTGAGCGCTCCCAGCCTGAGCCCCCATTGCTGCCCGTGCA 



TEPAEGLSAPSLSPHCCPCR



GGGCTAGGCTCGCCTTCAGGAATCTGGGCGCTTTGCTGCCCCGCCTGCATCAGCTGTGCT 
ARLAFRNLGALLPRLHQLCC



RMPRTLRRLFVAELCSWMAL



TGATGACGTTCACCCTCTTCTACACCGACTTCGTGGGGGAGGGCCTGTACCAGGGCGTGC 
MTFTLFYTDFVGEGLYQGVP 

CCAGGGCCGAGCCCGGCACCGAGGCTAGGCGCCATTACGACGAGGGCGTCAGGATGGGCT
RAEPGTEARRHYDEGVRMGS 

CTCTGGGCCTCTTCCTGCAGTGCGCCATCAGTCTGGTGTTCTCTCTGGTGATGGACCGGC 
LGLFLQCAISLVFSLVMDRL

TGGTGCAGCGCTTCGGCACCCGGGCCGTGTACCTCGCCTCTGTGGCGGCTTTCCCCGTCG 
VQRFGTRAVYLASVAAFPVA 

CCGCCGGCGCGACCTGCCTGTCTCATTCTGTCGCCGTGGTGACCGCCAGCGCCGCCCTGA 
AGATCLSHSVAVVTASAALT 

CCGGCTTCACCTTCAGTGCGCTCCAGATTCTGCCCTACACCCTGGCGTCTCTGTACCATC 
GFTFSALQILPYTLASLYHR

GCGAGAAGCAGGTGTTCCTGCCCAAGTACCGCGGGGACACAGGGGGAGCTTCCTCTGAGG 
EKQVFLPKYRGDTGGASSED 



VGAGGSGLLPPPPALCGASA

CCTGCGACGTGAGCGTGCGGGTGGTGGTGGGGGAGCCCACCGAGGCTAGGGTCGTGCCTG 
CDVSVRVVVGEPTEARVVPG 

GCCGGGGGATCTGCCTGGACCTGGCCATCCTCGACTCCGCCTTCCTGCTCTCCCAGGTGG 
RGICLDLAILDSAFLLSQVA

CGCCCAGCCTGTTCATGGGCAGTATCGTGCAGCTGAGCCAGAGCGTGACCGCCTACATGG 
PSLFMGSIVQLSQSVTAYMV

TGAGCGCCGCCGGCCTGGGGTTGGTGGCCATCTACTTTGCCACCCAGGTCGTGTTCGACA 
SAAGLGLVAIYFATQVVFDK



SDLAKYSAMVQRLWVSRLLR 



HRKAQLLLVNLLTF 

TGGCTGCCGGGATCACGTACGTGCCCCCCCTGCTGCTGGAGGTGGGCGTGGAGGAGTGAG 
AAGITYVPPLLLEVGVEE*



Construct D = SEQ ID NO:40 (nucleotide) & 48 (polypeptide) 



GCGGCCGCGCCACCATGGTGCAGCGGCTGTGGGTGTCCCGGCTGCTGCGCCATAGAAAGG 
MVQRLWVSRLLRHRKA



PPLLLEVGVEEMAAAYV

TGCATAGCGACGGGAGCTACCCCAAGGACAAGTTCGAGAAGATCAACGGGACATGGTACT
HSDGSYPKDKFEKINGTWYY

ACTTCGACTCCTCCGGCTACATGCTCGCCGACCGCTGGCGGAAGCACACCGACGGCAACT 
FDSSGYMLADRWRKHTDGNW 

GGTACTGGTTCGATAACTCGGGAGAGATGGCCACCGGCTGGAAGAAGATCGCGGACAAGT 
YWFDNSGEMATGWKKIADKW 

GGTACTATTTCAACGAGGAGGGCGCCATGAAGACCGGCTGGGTGAAGTATAAGGACACCT 
YYFNEEGAMKTGWVKYKDTW

GGTACTACCTCGACGCCAAGGAGGGCGCCATGCAGTATATCAAGGCCAACAGCAAGTTCA
YYLDAKEGAMQYIKANSKFI 

TCGGCATCACCGAGGGAGTGATGGTCAGCAACGCCTTTATCCAGAGCGCCGACGGCACCG 
GITEGVMVSNAFIQSADGTG 

GATGGTACTACTTGAAGCCGGACGGCACCCTCGCGGATCGGCCCGAGAAGTTCATGTACA 
WYYLKPDGTLADRPEKFMYM 

TGGTGCTGGGCATCGGCCCCGTCCTGGGCCTCGTGTGTGTGCCCCTCCTCGGGAGTGCGT 
VLGIGPVLGLVCVPLLGSAS

CCGATCATTGGCGGGGCCGCTACGGCCGCCGCAGACCGTTCATCTGGGCCCTGAGCCTGG 
DHWRGRYGRRRPFIWALSLG 

GCATCCTGCTCTCTCTCTTCCTGATCCCCCGGGCCGGCTGGCTGGCCGGCCTGCTGTGTC 
ILLSLFLIPRAGWLAGLLCP

CCGACCCCCGCCCTCTGGAGCTGGCCCTCCTGATCCTGGGCGTGGGCCTGCTGGACTTCT 
DPRPLELALLILGVGLLDFC 

GCGGCCAGGTGTGTTTCACTCCCCTGGAGGCTCTGCTCTCCGACCTCTTCCGCGACCCCG 
GQVCFTPLEALLSDLFRDPD

ACCACTGTAGGCAGGCTTACAGCGTGTACGCCTTCATGATCAGTCTGGGGGGATGCCTGG 
HCRQAYSVYAFMISLGGCLG 

GCTATCTGCTGCCCGCTATCGACTGGGACACCAGCGCCCTGGCCCCCTACCTGGGGACTC 
YLLPAIDWDTSALAPYLGTQ

AGGAGGAGTGCCTGTTCGGCCTGCTCACCTTGATCTTCCTGACGTGCGTCGCCGCCACCC 
EECLFGLLTLIFLTCVAATL 

TGCTGGTGGCCGAGGAGGCGGCCCTGGGGCCCACCGAGCCCGCCGAGGGCCTGAGCGCTC 
LVAEEAALGPTEPAEGLSAP

CCAGCCTGAGCCCCCATTGCTGCCCGTGCAGGGCTAGGCTCGCCTTCAGGAATCTGGGCG 
SLSPHCCPCRARLAFRNLGA 

CTTTGCTGCCCCGCCTGCATCAGCTGTGCTGTCGCATGCCTCGCACCCTGCGCCGCCTGT 
LLPRLHQLCCRMPRTLRRLF 

TCGTCGCTGAGCTCTGTTCCTGGATGGCCCTGATGACGTTCACCCTCTTCTACACCGACT 
VAELCSWMALMTFTLFYTDF 

TCGTGGGGGAGGGCCTGTACCAGGGCGTGCCCAGGGCCGAGCCCGGCACCGAGGCTAGGC 
VGEGLYQGVPRAEPGTEARR

GCCATTACGACGAGGGCGTCAGGATGGGCTCTCTGGGCCTCTTCCTGCAGTGCGCCATCA
HYDEGVRMGSLGLFLQCAIS

GTCTGGTGTTCTCTCTGGTGATGGACCGGCTGGTGCAGCGCTTCGGCACCCGGGCCGTGT 
LVFSLVMDRLVQRFGTRAVY 

ACCTCGCCTCTGTGGCGGCTTTCCCCGTCGCCGCCGGCGCGACCTGCCTGTCTCATTCTG 
LASVAAFPVAAGATCLSHSV 



AVVTASAALTGFTFSALQIL



PYTLASLYHREKQVFLPKYR 

GCGGGGACACAGGGGGAGCTTCCTCTGAGGACAGCCTGATGACCAGCTTCTTGCCCGGCC 
GDTGGASSEDSLMTSFLPGP

CCAAGCCGGGGGCCCCTTTCCCCAACGGCCATGTCGGGGCGGGCGGCAGCGGCCTGCTCC 
KPGAPFPNGHVGAGGSGLLP 

CTCCCCCCCCCGCCCTGTGCGGCGCTAGTGCCTGCGACGTGAGCGTGCGGGTGGTGGTGG 
PPPALCGASACDVSVRVVVG

GGGAGCCCACCGAGGCTAGGGTCGTGCCTGGCCGGGGGATCTGCCTGGACCTGGCCATCC 
EPTEARVVPGRGICLDLAIL 

TCGACTCCGCCTTCCTGCTCTCCCAGGTGGCGCCCAGCCTGTTCATGGGCAGTATCGTGC 
DSAFLLSQVAPSLFMGSIVQ 

AGCTGAGCCAGAGCGTGACCGCCTACATGGTGAGCGCCGCCGGCCTGGGGTTGGTGGCCA 
LSQSVTAYMVSAAGLGLVAI

TCTACTTTGCCACCCAGGTCGTGTTCGACAAGAGCGATCTCGCCAAGTATAGCGCCTGAG 
YFATQVVFDKSDLAKYSA* 

FIG. 21 - Western blot analysis of CHO cells following transient transfection with
P501S (JNW680), CPC-P501S (JNW735) and empty vector control.

Lane  Sample
1     CPC-P501S (JNW735)
2     CPC P501S protein (62.5 ng)
3     P501S (JNW680)
4     P501S (JNW680)
5     Empty vector control

FIG. 22 - Anti-P501S antibody responses following immunisation at day 0, 21 & 42
with pVAC-P501S (JNW680, mice B1-9) or Empty vector (pVAC, mice A1-6).







FIG. 23 - Peptide library screen using C57BL/6 mice immunised at day 0, 21, 42, and 
70 with pVAC-P501S (JNW680). 

FIG. 24 - Cellular responses by ELISPOT at day 77 following PMID immunisation at
day 0, 21, 42, and 70 with pVAC-P501S (JNW680, BB-9) and pVAC empty (A4-6).

Graph A shows the IFN-γ responses whilst Graph B shows the IL-2 responses.

FIG. 25 - Comparison of P501S and CPC-P501S.

[Graph; groups shown: Control, P501S, CPC-P501S]



FIG. 26 - Immune response (lymphoproliferation on spleen cells) following protein 
immunisation with CPC-P501S. 

FIG. 27 - Evaluation of the immune response to different CPC-P501S constructs

FIG. 28. MUC1-CPC DNA and polypeptide sequences
FIG. 28A. DNA sequence (SEQ ID NO.49)



GTCATGCAAGCTCTACCCCAGGTGGAGAAAAGGAGACTTCGGCTACCCAGAGAAGTTCAGTGCCCAGCTCTAC
TGAGAAGAATGCTGTGAGTATGACCAGCAGCGTACTCTCCAGCCACAGCCCCGGTTCAGGCTCCTCCACCACT 
CAGGGACAGGATGTCACTCTGGCCCCGGCCACGGAACCAGCTTCAGGTTCAGCTGCCACCTGGGGACAGGATG 
TCACCTCGGTCCCAGTCACCAGGCCAGCCCTGGGCTCCACCACCCCGCCAGCCCACGATGTCACCTCAGCCCC 
GGACAACAAGCCAGCCCCGGGCTCCACCGCCCCCCCAGCCCACGGTGTCACCTCGGCCCCGGACACCAGGCCG 
CCCCCGGGCTCCACCGCCCCCCCAGCCCACGGTGTCACCTCGGCCCCGGACACCAGGCCGCCCCCGGGCTCCA 
CCGCGCCCGCAGCCCACGGTGTCACCTCGGCCCCGGACACCAGGCCGGCCCCGGGCTCCACCGCCCCCCCAGC 
CCATGGTGTCACCTCGGCCCCGGACAACAGGCCCGCCTTGGCGTCCACCGCCCCTCCAGTCCACAATGTCACC
TCGGCCTCAGGCTCTGCATCAGGCTCAGCTTCTACTCTGGTGCACAACGGCACCTCTGCCAGGGCTACCACAA 
CCCCAGCCAGCAAGAGCACTCCATTCTCAATTCCCAGCCACCACTCTGATACTCCTACCACCCTTGCCAGCCA 
TAGCACCAAGACTGATGCCAGTAGCACTCACCATAGCACGGTACCTCCTCTCACCTCCTCCAATCACAGCACT 
TCTCCCCAGTTGTCTACTGGGGTCTCTTTCTTTTTCCTGTCTTTTCACATTTCAAACCTCCAGTTTAATTCCT 
CTCTGGAAGATCCCAGCACCGACTACTACCAAGAGCTGCAGAGAGACATTTCTGAAATGTTTTTGCAGATTTA 
TAAACAAGGGGGTTTTCTGGGCCTCTCCAATATTAAGTTCAGGCCAGGATCTGTGGTGGTACAATTGACTCTG 
GCCTTCCGAGAAGGTACCATCAATGTCCACGACGTGGAGACACAGTTCAATCAGTATAAAACGGAAGCAGCCT 
CTCGATATAACCTGACGATCTCAGACGTCAGCGTGAGTGATGTGCCATTTCCTTTCTCTGCCCAGTCTGGGGC 
TGGGGTGCCAGGCTGGGGCATCGCGCTGCTGGTGCTGGTCTGTGTTCTGGTTGCGCTGGCCATTGTCTATCTC 

ACCATCCTATGAGCGAGTACCCCACCTACCACACCCATGGGCGCTATGTGCCCCCTAGCAGTACCGATCGTAG 
CCCCTATGAGAAGGTTTCTGCAGGTAATGGTGGCAGCAGCCTCTCTTACACAAACCCAGCAGTGGCAGCCACT 
TCTGCCAACTTGATGGCGGCCGCTTACGTACATTCCGACGGCTCTTATCCAAAAGACAAGTTTGAGAAAATCA
ATGGCACTTGGTACTACTTTGACAGTTCAGGCTATATGCTTGCAGACCGCTGGAGGAAGCACACAGACGGCAA 
CTGGTACTGGTTCGACAACTCAGGCGAAATGGCTACAGGCTGGAAGAAAATCGCTGATAAGTGGTACTATTTC 
AACGAAGAAGGTGCCATGAAGACAGGCTGGGTCAAGTACAAGGACACTTGGTACTACTTAGACGCTAAAGAAG 
GCGCCATGCAATACATCAAGGCTAACTCTAAGTTCATTGGTATCACTGAAGGCGTCATGGTATCAAATGCCTT 
TATCCAGTCAGCGGACGGAACAGGCTGGTACTACCTCAAACCAGACGGAACACTGGCAGACAGGCCAGAATGA 

FIG. 28B. MUC1-CPC polypeptide sequence (SEQ ID NO.50) 

MTPGTQSPFFLLLLLTVLTVVTGSGHASSTPGGEKETSATQRSSVPSSTEKNAVSMTSSVLSSHSPGSGSSTT 
QGQDVTLAPATEPASGSAATWGQDVTSVPVTRPALGSTTPPAHDVTSAPDNKPAPGSTAPPAHGVTSAPDTRP 
PPGSTAPPAHGVTSAPDTRPPPGSTAPAAHGVTSAPDTRPAPGSTAPPAHGVTSAPDNRPALASTAPPVHNVT
SASGSASGSASTLVHNGTSARATTTPASKSTPFSIPSHHSDTPTTLASHSTKTDASSTHHSTVPPLTSSNHST
SPQLSTGVSFFFLSFHISNLQFNSSLEDPSTDYYQELQRDISEMFLQIYKQGGFLGLSNIKFRPGSVVVQLTL
AFREGTINVHDVETQFNQYKTEAASRYNLTISDVSVSDVPFPFSAQSGAGVPGWGIALLVLVCVLVALAIVYL
IALAVCQCRRKNYGQLDIFPARDTYHPMSEYPTYHTHGRYVPPSSTDRSPYEKVSAGNGGSSLSYTNPAVAAT
SANLMAAAYVHSDGSYPKDKFEKINGTWYYFDSSGYMLADRWRKHTDGNWYWFDNSGEMATGWKKIADKWYYF
NEEGAMKTGWVKYKDTWYYLDAKEGAMQYIKANSKFIGITEGVMVSNAFIQSADGTGWYYLKPDGTLADRPE



FIG. 29. ss-CPC-MUC1 construct and sequence
FIG. 29A. DNA sequence (SEQ ID NO.51)

ATGGGATGGAGCTGTATCATCCTCTTCTTGGTAGCAACAGCTACAGGTGTCCACTCCCAGGTCCAAATGGCGG
CCGCTTACGTACATTCCGACGGCTCTTATCCAAAAGACAAGTTTGAGAAAATCAATGGCACTTGGTACTACTT

TGACAGTTCAGGCTATATGCTTGCAGACCGCTGGAGGAAGCACACAGACGGCAACTGGTACTGGTTCGACAAC 
TCAGGCGAAATGGCTACAGGCTGGAAGAAAATCGCTGATAAGTGGTACTATTTCAACGAAGAAGGTGCCATGA 
AGACAGGCTGGGTCAAGTACAAGGACACTTGGTACTACTTAGACGCTAAAGAAGGCGCCATGCAATACATCAA 
GGCTAACTCTAAGTTCATTGGTATCACTGAAGGCGTCATGGTATCAAATGCCTTTATCCAGTCAGCGGACGGA 
ACAGGCTGGTACTACCTCAAACCAGACGGAACACTGGCAGACAGGCCAGAAATGACACCGGGCACCCAGTCTC 
CTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGTTGTTACAGGTTCTGGTCATGCAAGCTCTACCCCAGG 
TGGAGAAAAGGAGACTTCGGCTACCCAGAGAAGTTCAGTGCCCAGCTCTACTGAGAAGAATGCTGTGAGTATG
ACCAGCAGCGTACTCTCCAGCCACAGCCCCGGTTCAGGCTCCTCCACCACTCAGGGACAGGATGTCACTCTGG
CCCCGGCCACGGAACCAGCTTCAGGTTCAGCTGCCACCTGGGGACAGGATGTCACCTCGGTCCCAGTCACCAG 




GACAACAGGCCCGCCTTGGCGTCCACCGCCCCTCCAGTCCACAATGTCACCTCGGCCTCAGGCTCTGCATCAG 
GCTCAGCTTCTACTCTGGTGCACAACGGCACCTCTGCCAGGGCTACCACAACCCCAGCCAGCAAGAGCACTCC 
ATTCTCAATTCCCAGCCACCACTCTGATACTCCTACCACCCTTGCCAGCCATAGCACCAAGACTGATGCCAGT 
AGCACTCACCATAGCACGGTACCTCCTCTCACCTCCTCCAATCACAGCACTTCTCCCCAGTTGTCTACTGGGG 



TCTCTTTCTTTTTCCTGTCTTTTCACATTTCAAACCTCCAGTTTAATTCCTCTCTGGAAGATCCCAGCACCGA
CTACTACCAAGAGCTGCAGAGAGACATTTCTGAAATGTTTTTGCAGATTTATAAACAAGGGGGTTTTCTGGGC 

ATGTCCACGACGTGGAGACACAGTTCAATCAGTATAAAACGGAAGCAGCCTCTCGATATAACCTGACGATCTC 

GCGCTGCTGGTGCTGGTCTGTGTTCTGGTTGCGCTGGCCATTGTCTATCTCATTGCCTTGGCTGTCTGTCAGT 
GCCGCCGAAAGAACTACGGGCAGCTGGACATCTTTCCAGCCCGGGATACCTACCATCCTATGAGCGAGTACCC 
CACCTACCACACCCATGGGCGCTATGTGCCCCCTAGCAGTACCGATCGTAGCCCCTATGAGAAGGTTTCTGCA 
GGTAATGGTGGCAGCAGCCTCTCTTACACAAACCCAGCAGTGGCAGCCACTTCTGCCAACTTGTAG 

FIG. 29B. ss-CPC-MUC1 polypeptide sequence (SEQ ID NO.52)

MGWSCIILFLVATATGVHSQVQMAAAYVHSDGSYPKDKFEKINGTWYYFDSSGYMLADRWRKHTDGNWYWFDN
SGEMATGWKKIADKWYYFNEEGAMKTGWVKYKDTWYYLDAKEGAMQYIKANSKFIGITEGVMVSNAFIQSADG
TGWYYLKPDGTLADRPEMTPGTQSPFFLLLLLTVLTVVTGSGHASSTPGGEKETSATQRSSVPSSTEKNAVSM
TSSVLSSHSPGSGSSTTQGQDVTLAPATEPASGSAATWGQDVTSVPVTRPALGSTTPPAHDVTSAPDNKPAPG
STAPPAHGVTSAPDTRPPPGSTAPPAHGVTSAPDTRPPPGSTAPAAHGVTSAPDTRPAPGSTAPPAHGVTSAP
DNRPALASTAPPVHNVTSASGSASGSASTLVHNGTSARATTTPASKSTPFSIPSHHSDTPTTLASHSTKTDAS
STHHSTVPPLTSSNHSTSPQLSTGVSFFFLSFHISNLQFNSSLEDPSTDYYQELQRDISEMFLQIYKQGGFLG
LSNIKFRPGSVVVQLTLAFREGTINVHDVETQFNQYKTEAASRYNLTISDVSVSDVPFPFSAQSGAGVPGWGI
ALLVLVCVLVALAIVYLIALAVCQCRRKNYGQLDIFPARDTYHPMSEYPTYHTHGRYVPPSSTDRSPYEKVSA
GNGGSSLSYTNPAVAATSANL